Abstract


In this report we describe both algorithms and experimental results for what we
refer to as center-based approaches to realtime distributed resource allocation.
Resources are modeled as agents that communicate with each other to exchange
resource requirements and task information; one or more distinguished “center
agents” mediate task assignments and interactions between agents. Agents are
also architected to monitor and contribute to team commitments. Several
variations of center-based allocation are explored: (1) Dynamic Mediation is an
approach which supports negotiations between agents in the context of a changing
problem and resource situation and in which tasks can interact in both positive
and negative ways; (2) Allocation Improvement addresses the problem of
combinatorial resource allocation; and (3) the Distributed Dispatcher Manager (DDM)
supports the hierarchical organization and management of massive agent-based
systems, on the order of thousands of agents and tasks. We also present a solution
to the problem of communications management in the context of a multisensor
allocation problem.
Chapter 1

Project  overview 


This report describes work performed by SRI International, Harvard University,
and Bar-Ilan University under Contract F30602-99-C-0169 for the DARPA
Autonomous Negotiating Teams (ANTS) Program. At Harvard and Bar-Ilan
Universities, part of this work was also funded under NSF grant number IIS9907482.
Sections of Chapter 2 and Chapter 3 have appeared in the volume published
in 2003 by Kluwer Academic Publishing entitled Distributed Sensor Networks
[Lesser et al 2003]. The authors of those chapters herein acknowledge that
copyrighted material; any reproduction of this report should respect that copyright.


1.1  The  problem 

In  this  report,  we  describe  our  contributions  toward  the  solution  of  the  problem  of 
realtime  distributed  resource  allocation.  We  use,  for  the  most  part,  the  distributed 
multi-sensor  challenge  problem  proposed  by  the  ANTS  program  to  motivate  our 
research.  However,  our  contributions  are  not  restricted  to  the  sensor  domain.  We 
also  describe  the  application  of  our  ideas  to  other  domains  when  we  see  that  as 
pedagogically  helpful,  while  at  the  same  time  drawing  connections  to  problems  in 
the  sensor  domain. 

We  model  the  resource  allocation  problem  as  a  multiagent  problem  in  which 
each  resource  is  modeled  as  an  agent  which  can  communicate  with  other  agents  to 
exchange  resource  requirements  or  task  information.  Agent  interactions  generally 
take  the  form  of  message  exchanges  to  support  auction-style  algorithms  in  which 
a  mediator  requests  bids  on  a  task  or  a  collection  of  tasks  and  then  receives  bids 
from  agents.  Each  bid  encapsulates  local  information,  important  to  the  allocation 
decision, in the form of utility or cost estimates.

Our focus has been on problem settings and solutions with the following
characteristics:

Distributed: Any resource allocation algorithm should be distributed in the sense
that it should not depend on some centralized repository of global information
where allocation decisions must be made. Such an assumption would
be overly restrictive given the constraints imposed within real-world settings
in which inter-agent communications may be limited.

Incremental and realtime: The time-stressed nature of real-world problem
domains precludes the possibility of computing optimal resource allocations
before execution. Instead, agents should negotiate partial, good-enough
allocations which can later be refined if time permits.

Flexible  task  allocation:  Task  allocation  mechanisms  should  be  flexible  in  the 
sense  that  potential  allocations  can  be  explored  either  sequentially,  in  terms 
of  possible  task-resource  pairs,  or  combinatorially  in  the  form  of  sets  of 
multiple  tasks  and  resources.  In  the  latter  case,  mechanisms  should  be  able 
to  deal  with  tasks  that  can  interact:  that  is,  in  which  the  cost  of  doing  several 
tasks  is  not  simply  the  sum  of  the  individual  costs  of  each  task. 

Adaptive  resource  allocation:  Dynamic  problem  settings  in  which  new  tasks  or 
resources  can  appear  (or  disappear)  during  the  allocation  process  require 
that  it  not  be  necessary  for  allocation  processes  to  be  re-started  from  scratch 
each  time  the  global  situation  changes. 

Adaptive communications: Since communications bandwidth is assumed to
vary, allocation algorithms should be able to adapt to limits imposed by
the communications medium.

Fault  tolerant:  Any  solution  should  be  fault  tolerant  in  the  sense  that  it  should 
be  adaptable  to  resource  loss  during  execution,  as  opposed  to  requiring  that 
the  allocation  be  re-started  from  scratch. 

Scalable:  Any  solution  should  be  scalable  to  very  large  agent  and  task  settings. 

Figure 1.1 is an example of realtime resource allocation involving multi-sensor
tracking. The figure shows an array of 9 doppler sensors. Each sensor has three
sectors associated with it, labeled {1,2,3}. A sensor can turn on a sector and take
both frequency and amplitude measurements in order to determine velocity and
distance. The more sectors that are on, the greater the power usage. The farther
away the target is from the sensor, the lower the quality of the measurement. At
least two sensors are necessary for estimating the location of an object; three
sensors are desirable for obtaining a good quality estimate. Sectors require a 2
second warm up time and two objects which appear in the same sector and at the
same time cannot be discriminated.

We conducted experiments on both a physical sensor suite and a simulation,
called RadSim, based on the sensor suite. RadSim is a simulation environment
developed by the Air Force Research Laboratory to support what will be referred
to as the ANTS distributed resource allocation challenge problem. RadSim
provides a simulated environment containing moving targets that are to be tracked
(i.e. their positions determined over time) using simulated sensor nodes. Each
sensor node is modeled as a 3-head Doppler radar unit and an 8-channel
communications transceiver. The sensors are controlled by external agents that can
set sensor parameters, take measurements, and communicate with other
agents (for details, see Chapter 2 of [Lesser et al 2003]). Due to the partial state
of the ANTS program tracking software at the time of completion of this project,
this report focuses on results from the simulation.


1.2  Major  elements  of  our  approach 

Our approach to the multi-sensor tracking problem involves three stages: (1)
initial coalition formation; (2) formation of a future coalition based on a projected
object path; and (3) refinement of an existing coalition. We use the term coalition
to refer to a group of agents who are joining together to perform some task; the
members of a coalition can change as circumstances change. The initial coalition
formation is a very quick process which assigns a group of three agents to a
target. The initial coalition’s task is to determine the position, direction and velocity
of the target. In the second stage, one of the agents in the initial coalition takes
that information and projects the path of the target into the future and then runs
an auction or, as we will refer to its variants, a center-based algorithm on some
set of agents that neighbor the projected path. The projected path is represented
by a cone of uncertainty: the farther into the future, the greater the uncertainty in
the projected position. In the final stage, a coalition can be adapted to changes
that might occur in the path of a target: when a particular coalition member, A,
notices the change, it informs the remaining agents in the original coalition of that
change so that they can drop their commitments; A then runs a new auction to
allocate new resources to the new path.
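
To make the second stage more concrete, the following Python sketch selects the sensors that neighbor a projected path. It is only an illustration: the constant-velocity projection, the linearly growing uncertainty radius, the sensor range, and the names `project_path`, `cone_radius`, and `neighbors_of_path` are all assumptions made for this example and are not taken from the ANTS tracking software.

```python
import math

def project_path(pos, vel, times):
    """Projected (t, x, y) points, assuming constant velocity (an assumption)."""
    return [(t, pos[0] + vel[0] * t, pos[1] + vel[1] * t) for t in times]

def cone_radius(t, base_radius=1.0, growth=0.5):
    """Hypothetical uncertainty radius that grows linearly with look-ahead time."""
    return base_radius + growth * t

def neighbors_of_path(sensors, pos, vel, times, sensor_range=5.0):
    """Names of sensors whose range overlaps the cone at any projected point."""
    selected = set()
    for t, x, y in project_path(pos, vel, times):
        reach = sensor_range + cone_radius(t)
        for name, (sx, sy) in sensors.items():
            if math.hypot(sx - x, sy - y) <= reach:
                selected.add(name)
    return sorted(selected)

# Toy layout: two sensors near the track and one far away.
sensors = {"s1": (0.0, 0.0), "s2": (10.0, 0.0), "s3": (40.0, 40.0)}
candidates = neighbors_of_path(sensors, pos=(0.0, 0.0), vel=(2.0, 0.0), times=[0, 2, 4])
```

Because the radius grows with the look-ahead time, more distant sensors become candidates for later points on the path, which mirrors the widening cone described above.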

Given a projected track, $\{(l_1,t_1),(l_2,t_2),(l_3,t_3),(l_4,t_4)\}$, (for example,
corresponding to the points 1, 2, 3, and 4 in track T1 shown in the figure) of
location-time points, we have studied four allocation schemes: (1) a standard auction in
which each point is auctioned to some subset of nodes; (2) a combinatorial
allocation scheme in which sets of location-time points are considered simultaneously;
(3) mediation, in which a mediator suggests allocations to some subset of nodes at
each round of negotiation; and (4) a variation in which a mediator allocates agents
to tasks through a hierarchical team organization.


1.3  Outline  of  report 

The remainder of this document is organized in the following way. In
Chapter 2 we explore variations of center-based algorithms for resource allocation; in
particular, we focus on a variant which we call Dynamic Mediation that directly
addresses the dynamic and realtime aspects of distributed resource allocation. In
that chapter, we also present a combinatorial allocation algorithm as well as an
agent architecture that supports monitoring of team commitments. In Chapter 3
we present a system called the Distributed Dispatcher Manager (DDM) for
managing massive agent-based systems, on the order of thousands of agents and tasks.
DDM is the first system able to manage such large collections; it does so by
organizing agents hierarchically into teams. In Chapter 4 we present a solution to the
problem of communications management in the context of the ANTS challenge
problem. Finally, Chapter 5 summarizes our contributions.
Chapter 2

Center-based  negotiation 


2.1  Introduction 

This chapter introduces a class of negotiation protocols appropriate for problem
domains in which tasks can interact arbitrarily and in which agents must
negotiate over the assignment of resources to tasks in dynamically changing settings.
We use the term negotiation to simply refer to any distributed process through
which agents can agree on an efficient apportionment of tasks among themselves.
Our work has compelled us to reject various assumptions commonly made in the
negotiation literature: these include assumptions that (1) the context in which a
negotiation or bid is made is irrelevant to the negotiation and, consequently, that
task costs or utilities can be assumed to be additive: we allow instead for the
possibility of positive and negative task interactions; (2) the environment remains
static during a negotiation: we allow for the possibility that important changes can
occur during a negotiation that can affect the results of that negotiation; and (3) all
changes in the environment can be anticipated during negotiation: in any realistic
domain, the world may change in unexpected ways at execution time, requiring
that a solution be adapted to those changes. We restrict ourselves to problems
involving resource-bounded agents comprising cooperative teams. The requirement
for resource-boundedness means that negotiation protocols must be temporally
constrained in some way and that an optimal allocation will not always be
possible. The restriction to cooperative teams corresponds to a restriction to agents that
can be assumed to be honest and non-competitive.

The focus on non-additive domains is an admission that tasks can interact in
interesting and significant ways and that such interactions should be kept in mind
during a negotiation. Therefore a decision to allocate a task to a particular agent
cannot necessarily be made independently of a task allocation to another (or even
the same) agent. For example, in the sensor challenge problem, if a deactivated
emitter is activated, the beam is unstable and will not give reliable measurements
for 2 seconds; therefore, if one task is immediately followed by another in the
same sector, the beam will not require the 2 second warmup; this corresponds to a
positive task interaction. As another example, consider that only one of the three
detectors on a sensor can be scanned at a given time and each scan takes between
0.6-1.8 seconds; therefore, two sequential tasks that are less than 0.6 seconds apart
and occur in separate sectors will interact negatively. Consider another domain,
such as a delivery domain. When negotiating whether two tasks can be taken
on by the same agent (for example, a delivery to two different locations) agents
must consider whether those tasks interact in such a way that the joint cost of
performing both tasks might be greater or less than performing each separately
(for example, in the former case, if the delivery points are in opposite directions
entailing separate trips).
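
The two sensor interactions can be illustrated with a small non-additive cost function. This is a minimal sketch, assuming that cost is measured in seconds of sensor time, that a sector stays warm between two consecutive tasks in the same sector, and that an infeasible pairing is given infinite cost; the function names and the cost model are hypothetical and are not the cost functions used in the experiments.

```python
WARMUP = 2.0    # seconds a sector needs after activation (from the text)
MIN_SCAN = 0.6  # shortest scan duration in seconds (from the text)

def task_cost(task):
    """Stand-alone cost of one task: a warm-up plus one scan (assumed model)."""
    return WARMUP + MIN_SCAN

def joint_cost(first, second):
    """Cost of giving two sequential tasks, each (sector, start_time), to one sensor."""
    (sector1, t1), (sector2, t2) = first, second
    if sector1 == sector2:
        # Positive interaction: the sector is already warm, so warm-up is paid once.
        return WARMUP + 2 * MIN_SCAN
    if t2 - t1 < MIN_SCAN:
        # Negative interaction: only one sector can be scanned at a time, so the
        # second task cannot start on schedule. Infinite cost marks it infeasible.
        return float("inf")
    # No interaction: the joint cost is simply the sum of the individual costs.
    return task_cost(first) + task_cost(second)

same_sector = joint_cost((1, 0.0), (1, 1.0))
conflict = joint_cost((1, 0.0), (2, 0.3))
independent = joint_cost((1, 0.0), (2, 5.0))
```

In this toy model the same-sector pair costs less than the sum of its parts, while the conflicting pair cannot be performed at all; both outcomes violate the additivity assumption discussed above.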

The rejection of the assumption that the world remains static during a
negotiation requires that one develop negotiation protocols that allow a negotiation to be
adapted to such changes, rather than requiring that the negotiation be re-started.
The rejection of the assumption that the world also remains static during
execution requires that agents negotiating be architected to monitor the management
of deployed resources and flexibly respond to unexpected changes in a way that
respects team-centered concerns: that is, an agent’s commitments should not be
limited to those resulting just from the negotiation but should extend dynamically
to the team.

We shall, for the most part, use the distributed sensor domain to motivate our
design decisions; this has also served as an experimental testbed for validation.
However, our contributions toward the solution of these problems are not restricted
to the sensor domain. Consequently, we will describe the application of our ideas
to related domains when we see that as pedagogically helpful, while at the same
time drawing connections to problems in the sensor domain.

Our focus in this chapter is on one important class of negotiation algorithms
which we will refer to as center-based algorithms. Examples include sequential
auctions, combinatorial auctions, and contract nets. When the objects over which
agents negotiate correspond to tasks rather than physical objects (as in auctions),
contract nets are equivalent to auctions. Hence, we shall view them
interchangeably. At the heart of such mechanisms is a center agent (for example, an
auctioneer or contractor) that collects bids on proposed allocations. Each bid is meant to
compactly encapsulate important local information (such as utility information)
that can subsequently be made use of by the center in its decision on the best
allocation. A center-based approach is very different from approaches in which some
central coordinator has access to up-to-date information regarding the local states
of agents and then uses that knowledge to compute optimal allocations.
Furthermore, the amount of information contained in agents’ local states may be large,
thus rendering centralization infeasible, particularly in cases of communication
delays and system faults.

Combinatorial auctions have been suggested as a promising method for
exploring allocation of items that interact: agents have the freedom to choose
particular bundles of items. However, as we shall see, they leave unanswered the
question of how best to choose the bundles on which to bid and also assume that
an agent’s value does not interact or depend on the bids of other agents. Within
the broad notion of a center-based negotiation mechanism, many variations are
possible; those variations are not restricted to auctions. In fact, we will present
one variation, which we call mediation, that usefully allows considerations about
the context in which a bid is made.


2.2  Center-based  task  assignment 

We define a center-based task assignment problem in the following way. We
describe a system in terms of a set of possible “runs” or system executions, in the
form of state transitions for each agent, including a distinguished run representing
the actual execution. Formally, we have:

Definition 2.2.1 Let $\mathbb{R}$ stand for the set of real numbers. A task allocation system,
$M$, is represented as a four-tuple, $M = \langle A, T, u, P \rangle$, where,

1. $A = \{a_1, \ldots, a_n\}$ is a set of $n$ agents with some agent designated as the
mediator,

2. $T = \{t_1, \ldots, t_m\}$ is a set of $m$ tasks,

3. $u : A \times 2^T \to \mathbb{R} \cup \{\infty\}$ is a value function that returns the value which an
agent associates with a particular subset of tasks,

4. An assignment, $P$, or partition, of size $n$ on the set of tasks $T$ such that
$P = (P_1, P_2, \ldots, P_n)$, where $P_j$ contains the set of items assigned to agent $a_j$.

We refer to each element of $P$ as a proposal. We will sometimes represent a
proposal as an ordered $m$-tuple of agents, the $i$-th element of which corresponds to
the agent assigned to perform task $i$. The special proposal $\emptyset$ will represent the null
proposal, which corresponds to the situation in which no assignment of tasks is
possible. For example, $P_5 = (a_1, a_5, a_3, a_1)$ corresponds to the allocation in which
$t_1$ is assigned to agent $a_1$, $t_2$ to agent $a_5$, $t_3$ to agent $a_3$, and so on. The level of
detail encoded by each proposal is domain dependent. The algorithms described
in this paper assume that each proposal encodes sufficient detail to enable each
agent to evaluate its cost function.
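
The two representations of a proposal, the ordered tuple of agents and the partition of tasks, carry the same information. The short Python sketch below converts the example tuple into per-agent task sets; the function name `tuple_to_partition` and the string labels for agents and tasks are illustrative assumptions.

```python
def tuple_to_partition(proposal, agents):
    """Convert an m-tuple (the agent assigned to each task) into per-agent task sets."""
    partition = {agent: set() for agent in agents}
    for task_index, agent in enumerate(proposal, start=1):
        partition[agent].add(f"t{task_index}")
    return partition

agents = ["a1", "a2", "a3", "a4", "a5"]
p5 = ("a1", "a5", "a3", "a1")   # the example proposal from the text
partition = tuple_to_partition(p5, agents)
```

Here agent `a1` receives tasks `t1` and `t4`, while agents that appear nowhere in the tuple receive the empty set.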

The value function can represent, for example, the utility or cost attached to
a task, or perhaps some multi-objective utility function. In the experiments
described in this chapter, we associate either cost or utility to the value function,
depending on the particular focus of the experiment. Where the value function
ranges over a set of tasks consisting of only one element, we use the same
notation and refer only to the single element; we assume throughout that this secondary
use is clear from the context. There is some globally defined objective function,
$f$, that determines the desirability of an assignment based on the value that each
agent ascribes to the items to which it is assigned.

We assume the group objective is social welfare maximization and thus the
objective function is given by:

$$f(p, A) = \sum_{a \in A} u(a, p)$$

where $p \in P$.¹

The negotiation problem is that of choosing an element $p^*$ of $P$ that maximizes
the objective function:

$$p^* = \arg\max_{p \in P} f(p, A).$$

The proposal chosen is called the outcome of the negotiation.
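
For very small problems the outcome can be found by exhaustive enumeration, which makes the objective concrete. The sketch below is an illustration under stated assumptions: the toy value table, the bonus that agent `a1` receives for holding both tasks, and the names `social_welfare`, `best_proposal`, and `toy_value` are all hypothetical. It enumerates $n^m$ proposals, so it is not a practical algorithm, only a reference point.

```python
from itertools import product

def social_welfare(proposal, agents, value):
    """f(p, A): sum over agents of the value each assigns to its own tasks."""
    total = 0.0
    for agent in agents:
        own = frozenset(t for t, assignee in proposal.items() if assignee == agent)
        total += value(agent, own)
    return total

def best_proposal(agents, tasks, value):
    """Brute-force argmax of the social welfare over every task-to-agent mapping."""
    best, best_val = None, float("-inf")
    for assignees in product(agents, repeat=len(tasks)):
        proposal = dict(zip(tasks, assignees))
        val = social_welfare(proposal, agents, value)
        if val > best_val:
            best, best_val = proposal, val
    return best, best_val

# Hypothetical per-task values for two agents and two tasks.
BASE = {"a1": {"t1": 3.0, "t2": 1.0}, "a2": {"t1": 2.0, "t2": 4.0}}

def toy_value(agent, tasks):
    """Additive base values plus a positive interaction for a1 holding both tasks."""
    v = sum(BASE[agent][t] for t in tasks)
    if agent == "a1" and tasks == frozenset({"t1", "t2"}):
        v += 5.0
    return v

best, best_val = best_proposal(["a1", "a2"], ["t1", "t2"], toy_value)
```

With purely additive values the best split would give `t1` to `a1` and `t2` to `a2` for a welfare of 7; the interaction bonus makes assigning both tasks to `a1` the better outcome, with a welfare of 9.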

The amount of time available may or may not be known to the agents in
advance. For instance, agents may need to end their negotiation and begin acting
in a given number of milliseconds, or they may need to begin acting when some
event occurs at an unspecified time in the future.

¹ A common alternative objective is Pareto efficiency where the objective function can be
defined as:

$$f(p, A) = \begin{cases} 0 & \text{if } \exists q \in P \text{ such that } \forall a \in A,\ u(a, q) > u(a, p) \\ 1 & \text{otherwise} \end{cases}$$

Both Mediation and combinatorial auctions are examples of algorithms that
can be used to solve the assignment problem. They are both members of a class
of algorithms we will call center-based assignment (CBA) algorithms. In a CBA
algorithm, one agent $c$ is designated as the center. This agent may be a member of
$A$; it implements the algorithm provided in Figure 2.1. First, it initializes a history
variable that keeps track of agents’ responses. It then loops through a series of
cycles; each cycle involves constructing an announcement, communicating that
announcement, and receiving a bid from the agents.

    let H ← ∅
    loop
        construct an announcement a(H)
        send a(H) to each agent
        incorporate a(H) into H
        receive “bid” b_j(a, H) from each agent
        incorporate b_j(a, H) into H
        compute P(H)
    until terminate(H) or receive terminate signal
    send final assignment P(H) to agents

Figure 2.1: Center-based Assignment Algorithm

Definition. [Announcement] An announcement is a pair $(P, Q)$ where $P$ is
an assignment of the tasks in $T'$ where $T' \subseteq T$ and $Q = (T \setminus T')$ is the set of tasks
not contained in that assignment.

The  announcement  contains  both  an  assignment  of  items  and  a  set  of  items. 
Based  on  this  announcement  message,  each  agent  generates  a  message,  which  we 
call  a  bid  that  is  specified  as  follows. 

Definition. [Bid] After agent $a_j$ receives an announcement message $a(H) = (P, Q)$,
a bid $B = \{\beta_1, \beta_2, \ldots, \beta_l\}$ is a set of pairs where $\beta_i = (T', v)$. The first
element of each pair is a set of items $T' = P_j \cup T''$ where $T'' \subseteq Q$ and the second
element is the agent’s value for that set of items, $v = u(a_j, T')$.

At each iteration of the CBA algorithm’s loop, the center uses the bid history
to determine the next announcement. The loop may terminate as a result of a
decision made by the center based on the history or because of an external
terminate signal. When the loop terminates, the center sends an announcement of the
chosen assignment to the agents.
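
The loop of Figure 2.1 can be written as a small skeleton in which the announcement policy, the assignment rule, and the termination test are supplied as functions. This is a minimal sketch under several assumptions: agents are modeled as plain callables, the history is a list of tagged entries, and a `max_rounds` guard stands in for the external terminate signal. None of these names come from the report.

```python
def center_based_assignment(agents, make_announcement, compute_assignment,
                            should_terminate, max_rounds=100):
    """Generic center loop: announce, collect bids, recompute the assignment."""
    history = []        # H: every announcement and bid seen so far
    assignment = None
    for _ in range(max_rounds):  # assumed stand-in for the terminate signal
        announcement = make_announcement(history)
        history.append(("announce", announcement))
        for agent in agents:
            history.append(("bid", agent(announcement, history)))
        assignment = compute_assignment(history)
        if should_terminate(history):
            break
    return assignment

# Usage: a one-shot auction of a single task t1, with announcement (P, Q) = ({}, {t1}).
def make_bidder(name, value):
    return lambda announcement, history: (name, value)

def highest_bidder(history):
    bids = [entry for kind, entry in history if kind == "bid"]
    return max(bids, key=lambda bid: bid[1])[0]

winner = center_based_assignment(
    [make_bidder("a1", 3), make_bidder("a2", 7)],
    make_announcement=lambda history: ({}, {"t1"}),
    compute_assignment=highest_bidder,
    should_terminate=lambda history: True,
)
```

Swapping in a different announcement policy and termination test turns the same skeleton into an iterated procedure, which is the sense in which auctions and Mediation are both instances of CBA.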

Mediation and combinatorial auctions differ in the algorithm that the center
uses to determine the number of iterations allowed, the announcement made at
each iteration, and the bids that agents may make. The differences are summarized
in Figure 2.2. In a combinatorial auction, the auctioneer’s announcement contains
no assignments, thus $P = \emptyset$ and $Q = T$. Agents may bid on zero or more
subsets of tasks in $T$. In a one-shot combinatorial auction, only one iteration
of the algorithm takes place before winners are announced. In Mediation, the
mediator announces one possible assignment $P$. The agents bid on the value of
only that one assignment, and the procedure repeats. When time runs out, the
mediator announces the best assignment found so far.

    CBA                     Announcement         Bids              Iterations
    Combinatorial Auction   $T' = \emptyset$     $|B| \le 2^m$     usually one
    Mediation               $T' = T$             $|B| = 1$         many
    Hybrid                  $T' \subset T$       $|B| \le$ …       some

Figure 2.2: Overview of CBA algorithms

Combinatorial auctions give more flexibility to agents as they bid. For
example, agents may bid on a certain number of subsets of items that they value the
most, or they may bid on each subset of items. There is an exponentially large
space (size $2^m$) of subsets on which an agent may base its bid. After the
auctioneer receives bids, it runs a winner determination [Sandholm 1999a] algorithm
to find the best assignment based on the bids it has received. In Mediation, the
flexibility lies with the mediator and not the agents. The Mediator determines the
order in which assignments are announced. Agents have a simple rule for
bidding: report the value for the assignment encoded in the announcement that has
just been received.
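
The difference in bid size between the two mechanisms can be seen in a few lines of Python. This is a sketch for illustration only: the helper names are hypothetical, and the agent's value function is passed in as an arbitrary callable on sets of tasks.

```python
from itertools import combinations

def all_bundles(tasks):
    """Every subset of the tasks, including the empty set: 2**m bundles."""
    items = sorted(tasks)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def combinatorial_bid(agent_value, tasks):
    """Full combinatorial bid: one (bundle, value) pair per subset of tasks."""
    return [(bundle, agent_value(bundle)) for bundle in all_bundles(tasks)]

def mediation_bid(agent_value, announced_tasks):
    """Mediation bid: a single value for the tasks announced for this agent."""
    bundle = frozenset(announced_tasks)
    return [(bundle, agent_value(bundle))]

# Toy value function: an agent's value is simply the number of tasks it holds.
full_bid = combinatorial_bid(len, {"t1", "t2", "t3"})
single_bid = mediation_bid(len, {"t1"})
```

With three tasks the full combinatorial bid already contains eight pairs, while the Mediation bid always contains exactly one; the cost of that simplicity is that the mediator must decide which assignments are worth announcing.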

In mediation the flexibility lies with the center; compared to one-shot
combinatorial auctions, we claim that Mediation is better suited to real time
environments in which there is a limited amount of time to find an assignment. In
combinatorial auctions, agents can inform the center which subsets of items are
most valuable, thus eliminating time wasted on announcements that are likely to
be fruitless. CBA allows for a hybrid approach (see Figure 2.2), in which some
items may be assigned while others are open for agents to bid on as in a
combinatorial auction. This may allow for agents to achieve the benefits of both methods
where the center can fix assignments to a subset of items, and allow agents the
flexibility to bid on the rest of the items.

Definition 2.2.2 (Assumptions) In this chapter we assume (unless otherwise stated)
that:

1. No inter-agent interactions: that is, we restrict task interactions to those
occurring between a single agent’s tasks.

2. Synchronous and guaranteed communication.

3. No faults to the mediator. We relax this assumption later.

4. Only pairwise task interactions.

5. Task type information is common knowledge to all agents.

6. Agents have complete and correct information to make both planning decisions
and compute costs.

7. The cost functions allow for inter-personal comparisons and can be summed to
determine social welfare.

Figure 2.3: Multisensor tracking.

The distributed sensor challenge problem revisited. Figure 2.3 depicts an
array of nine doppler sensors. Each sensor has three sectors associated with it,
labeled {1, 2, 3}. A sensor can turn on a sector and take both frequency and
amplitude measurements to determine velocity and distance. A sensor can only have
one sector on at a time, however. The farther away the target is from the sensor,
the lower the quality of the measurement. At least two sensors are necessary for
estimating the location of an object; three or more sensors are desirable for
obtaining a good-quality estimate. Tasks can interact: for example, sectors require a
2 second warm-up time; therefore, an agent can benefit from tracking two targets
in sequence because of the saved warm-up time. Finally, two objects appearing in
the same sector and at the same time cannot be discriminated.

Tasks  can  appear  dynamically;  the  figure  shows  projected  paths  —  based  on 
initial  localization,  direction  and  velocity  measurements  —  for  two  targets,  T1 
and  T2.  The  problem  is  to  allocate,  in  a  distributed  manner,  a  set  of  sensors  along 
the  paths  of  both  targets.  Each  path  is  discretized  into  a  set  of  space-time  points 
along  the  path  (indicated  in  the  figure  by  small  dark  circles). 
In the case of the challenge problem, the varieties of center-based algorithms
that we have discussed each have particular advantages and disadvantages.
Sequential auctions are attractive because they specify simple bidding rules for agents.
Agents (sensors) bid on their expected contribution to a tracking task, which
amounts to communicating the agent’s current distance and angle from the task.
This suggests that auctions can be used as an allocation mechanism for sensors or
resources that need not remain stationary. However, one disadvantage of
sequential auctions is that they provide no context, in the form of a list of other tasks
to which an agent will be assigned in later auctions, on which to base a bid. The
context can reflect important interactions that can take place between tasks. With
sequential or parallel auctions, agents are compelled to make assumptions about
the outcomes of other, related auctions when bidding, and those assumptions may turn
out to be incorrect. For example, if two tasks that are in sequence are assigned to
the same sensor, and if the sensor has that information, it knows it can take
advantage of the fact that the sensor’s sector need not be warmed up in preparation for
the second task and bid accordingly.

Combinatorial auctions have the advantage of allowing an agent to pick certain
bundles of tasks which might interact in a favorable way, as in the example of
the previous paragraph. However, as already mentioned, they introduce a bid
generation problem. In addition, neither sequential nor combinatorial auctions can
handle task interactions that might arise at the group level. For example, if two
agents, $a_1$ and $a_2$, are bidding on $t_1$ and $t_2$, respectively, as part of a group task,
$t$, then $a_1$ might bid on $t_1$ differently if it knew that $a_2$ was planning to bid on
$t_2$. Consider a cooking scenario in which one
person is tasked with preparing a particular dish, part of which is to be prepared by
another agent. Knowledge of that other agent’s individual abilities (for example,
that the agent is particularly good at making a good dressing) can influence the
agent’s bid.

Other issues, discussed later in this chapter, concern task re-allocation and
providing a center with bids that are more informative than a single value.


2.3  Negotiation  in  context 

The Mediation algorithm is given in Figure 2.4. Its inputs are P, A, and an update
procedure, an example of which is called Allocation Improvement Mediation
(AIM) and is presented in Section 2.3.1. The Mediation algorithm supports making
group decisions in general settings. In this chapter we explore the properties
of Mediation as it applies to the more specific problem of task assignment.

function Mediation returns an outcome
    inputs: P, A, UpdateProcedure
    let b ← ∅, b_val ← VALUE(∅)
    loop
        c ← next value generated by UpdateProcedure
        broadcast c to A
        for each a_i in A
            receive msg_i from a_i
        c_val ← VALUE(msg_1, msg_2, ..., msg_n)
        if (c_val ≻ b_val) then
            b ← c, b_val ← c_val
    until (stop signal)
    return b

Figure 2.4: Mediation Algorithm

In Mediation, an agent is selected to act as mediator and implements a hill-climbing
search in the proposal space, while communicating with the distributed
group of agents through a communication channel. Use of the channel is costly
in terms of time and perhaps other resources, but it is assumed to be lossless.

The Mediation algorithm proceeds as follows. The mediator initializes a variable
b (which represents the best proposal found so far) with ∅, along with an
initial value denoted VALUE(∅). Then, it calls an update procedure to generate
another proposal c (called the current proposal). That proposal is broadcast to the
group. Each agent then responds with a message that is based on the proposal
that was broadcast; $msg_i$ denotes the message sent by agent i. The messages are
combined to form a value, denoted $\mathrm{VALUE}(msg_1, msg_2, \ldots, msg_n)$.¹ If that value
is preferred to the value for the current b (based on the preference relation $\succ$), b is
updated with the current proposal c.

The  algorithm  is  anytime:  it  can  be  halted  at  any  time  and  will  return  the 
best  proposal  found  so  far.  The  proposal  stored  in  variable  b  is  returned  as  the 
outcome  when  the  procedure  is  terminated.  Therefore,  Mediation  is  applicable 
even  if  agents  do  not  know  in  advance  how  much  time  they  will  have  to  negotiate. 
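The loop of Figure 2.4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used in the experiments: agents are modeled as plain callables that map a proposal to a message, the stop signal is modeled as an iteration budget (`max_iterations`), `None` stands in for ∅, and VALUE(∅) is passed in as `null_value`. All of these names are hypothetical.

```python
import operator
from typing import Any, Callable, Iterable, List, Optional, Tuple


def mediation(
    update_procedure: Iterable[Any],
    agents: List[Callable[[Any], Any]],
    value: Callable[[List[Any]], Any],
    prefers: Callable[[Any, Any], bool],
    null_value: Any,
    max_iterations: int,
) -> Tuple[Optional[Any], Any]:
    """Return the best proposal found so far and its value (b, b_val)."""
    b, b_val = None, null_value  # None plays the role of the empty proposal
    # zip stops at the iteration budget or when the update procedure is exhausted
    for _, c in zip(range(max_iterations), update_procedure):
        msgs = [agent(c) for agent in agents]  # broadcast c, collect one message per agent
        c_val = value(msgs)
        if prefers(c_val, b_val):  # the preference relation on values
            b, b_val = c, c_val
    return b, b_val


# Toy run (assumed data): proposals are integers, each agent reports a private utility.
agents = [lambda c: -(c - 3) ** 2, lambda c: -(c - 5) ** 2]
best, best_val = mediation(iter(range(10)), agents, sum, operator.gt, float("-inf"), 100)
# Anytime behaviour: with a budget of 3 iterations the best of the first 3 proposals is returned.
early, early_val = mediation(iter(range(10)), agents, sum, operator.gt, float("-inf"), 3)
```

Passing the update procedure as an iterator keeps the loop independent of the search order, which matches the way the text treats the update procedure as a pluggable input.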

¹ VALUE may return a real number (e.g., when the objective is social welfare) or a vector (e.g., when
the objective is Pareto efficiency).

With a choice of agent messages and value function that satisfies the properties
in Theorem 2.1, Mediation implements a hill-climbing search in the space of
proposals, with objective function f.

Theorem 2.1 Let $msg^p$ denote the messages returned by agents in G after the
mediator broadcasts proposal p. Assume $\mathrm{VALUE}(msg^p) \succ \mathrm{VALUE}(msg^q)$ if
and only if $f(p, G) > f(q, G)$, and $\mathrm{VALUE}(msg^p) \succ \mathrm{VALUE}(\emptyset)$ if and only if
$f(p, G) > f(\emptyset, G)$. Then after each iteration of the loop in the Mediation Algorithm,
b contains the best proposal (according to the objective function) generated
so far by the update procedure that is at least as good as ∅.

Proof. Assume that the update procedure generates a proposal $c_0$ such that
$f(c_0, G) > f(b, G)$ (i.e., it is the best proposal generated so far). Then
$\mathrm{VALUE}(msg^{c_0}) \succ \mathrm{VALUE}(msg^{b})$ by assumption, and b is updated with $c_0$. $f(b, G)$
will never be lower than $f(\emptyset, G)$ because for all $c_1$ generated such that $f(\emptyset, G) >
f(c_1, G)$, it is the case that $\neg(\mathrm{VALUE}(msg^{c_1}) \succ \mathrm{VALUE}(\emptyset))$, by assumption, and
no update occurs.

The example described in Section 2.3.2 implements agent messages and a value
function that are consistent with the assumptions of Theorem 2.1. It follows from
the theorem that if P is finite and the update procedure eventually returns every
proposal in P, given a sufficient number of iterations of the Mediation algorithm,
the final value for b will be a proposal that maximizes the objective function f.

Mediation does not impose a particular search order on the problem but rather
supports update rules that can be designed to search the space of proposals in a
variety of ways. A simple, uninformed update procedure returns a proposal randomly
selected with replacement from the set of all proposals. The Mediation
algorithm with such an update procedure is called Random Mediation. The next
section provides an example of another update rule that can be used with Mediation.

2.3.1  Allocation  Improvement 

Allocation Improvement defines an update procedure for Mediation that supports
task allocation domains. The procedure is sketched in Figure 2.5. The first proposal
p is chosen randomly from P; it provides a context from which subsequent
proposals are generated. For example, it might return $(\{t_2\}, \{t_0, t_1\})$, which corresponds
to the proposal where agent 0 is assigned to task 2, and agent 1 is assigned
to tasks 0 and 1. The advantage of this context is that it is common to all agents
and it ensures that each task is assigned to an agent.

let p ← random element of P − {∅}; return p
for i = 1 ... |T|
    for t ← every set of tasks of size i
        for a ← every possible assignment of agents in A to tasks in t
            q ← substitute a in p; return q
            if q_val ≻ p_val in Mediation, then p ← q

Figure 2.5: Sketch of the Allocation Improvement Update Procedure

In subsequent iterations, the procedure returns proposals that result from making
substitutions in p for i-tuples of tasks, where i goes from 1 to |T|. Substitutions
for each i-tuple of tasks are made sequentially with each permutation of agents in
lexicographic order, while maintaining the allocations for the other tasks. p is
always maintained to correspond with the best proposal seen so far (b from the
Mediation algorithm).

In the example, the next proposal chosen will involve substituting agent 0 as
the agent to perform task 0 (e.g., $(\{t_0, t_2\}, \{t_1\})$ will be returned). Then, agent 1
will be substituted as the agent to perform task 0, and so on. This procedure is then
repeated for each pair of tasks in lexicographic order (i.e., $(t_0, t_1)$, $(t_0, t_2)$, $(t_1, t_2)$).
Finally, every possible substitution (i.e., every element in P − {∅}) is sequentially
returned.
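The enumeration order described above can be sketched as a Python generator. This is an illustrative reading of Figure 2.5, not the authors' code. It assumes a proposal is encoded per task rather than per agent (`proposal[t]` is the agent assigned to task `t`), and that the mediator reports back through `send()` whether the last proposal became the best one; `allocation_improvement`, `run`, and `welfare` are hypothetical names.

```python
import itertools
import random
from typing import Generator, Tuple

Proposal = Tuple[int, ...]  # proposal[t] is the agent assigned to task t


def allocation_improvement(
    n_agents: int, n_tasks: int, rng: random.Random
) -> Generator[Proposal, bool, None]:
    """Yield proposals; the caller sends True when the last one became the best."""
    p = tuple(rng.randrange(n_agents) for _ in range(n_tasks))
    yield p  # random initial context
    for i in range(1, n_tasks + 1):  # sets of 1 task, then 2 tasks, ...
        for tasks in itertools.combinations(range(n_tasks), i):
            for agents in itertools.product(range(n_agents), repeat=i):
                q = list(p)
                for t, a in zip(tasks, agents):
                    q[t] = a  # substitute; the other tasks keep p's agents
                accepted = yield tuple(q)
                if accepted:
                    p = tuple(q)  # p tracks the mediator's best proposal b


def run(n_agents, n_tasks, welfare, seed=0):
    """Drive the generator to exhaustion; return the best proposal and the call count."""
    gen = allocation_improvement(n_agents, n_tasks, random.Random(seed))
    best, best_val, count = None, float("-inf"), 0
    q = next(gen)
    while True:
        count += 1
        val = welfare(q)
        accepted = val > best_val
        if accepted:
            best, best_val = q, val
        try:
            q = gen.send(accepted)
        except StopIteration:
            return best, count


# Toy welfare (assumed): task t is best served by agent t % 2.
best, count = run(2, 3, lambda q: -sum(abs(a - t % 2) for t, a in enumerate(q)))
```

With 2 agents and 3 tasks the generator produces 27 proposals in total, although there are only 8 distinct allocations; the repeats are the running-time overhead of searching without remembering what has already been returned.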

Overhead The Allocation Improvement procedure runs in constant space and is
guaranteed to eventually return every element of P (in the last stage), but it involves
an overhead cost in running time because it may return one proposal more than
once. In a task allocation problem with n agents and m tasks, the size of P is $n^m$
(excluding ∅). The Allocation Improvement procedure returns a total of $(1 + n)^m$
proposals. For example, with
20 agents and 10 tasks, the Allocation Improvement algorithm has an overhead of
62.9% because it returns 62.9% more proposals than there are elements of P before
it terminates. By increasing memory bounds, the overhead can be eliminated
because each proposal can be checked for redundancy before it is returned.
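The 62.9% figure can be checked with a short calculation. The symbols are assumptions made for this check: $n$ is the number of agents, $m$ the number of tasks, and stage $i$ of the procedure is taken to return $\binom{m}{i} n^i$ proposals (every assignment of agents to every set of $i$ tasks), plus one initial random proposal.

```latex
\begin{align*}
\text{proposals returned} &= 1 + \sum_{i=1}^{m} \binom{m}{i} n^{i} = (1 + n)^{m}, \\
\text{overhead} &= \frac{(1 + n)^{m} - n^{m}}{n^{m}} = \left(1 + \frac{1}{n}\right)^{m} - 1, \\
n = 20,\ m = 10: \quad \text{overhead} &= 1.05^{10} - 1 \approx 0.629 .
\end{align*}
```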

2.3.2  Experimental  Evaluation 

We performed a preliminary evaluation of mediation in a simplified version of the
Challenge Problem in which we map groups of three sensors into a single entity, so
that a proposal is of the standard form discussed earlier. An equivalent alternative
is to allow for more complex proposals, for example $(\{a_1, a_3, a_4\}, \{a_2, a_5\}, \ldots)$,
in which sets of agents are allocated to a single task. Agents represent n such
hidden sensors of varying quality that are randomly distributed within a 100-meter
by 100-meter square. An object appears at a point at the top edge of the square
and makes a 10-second trip at constant velocity through the square, exiting at the
bottom edge. One sensor is required to track the object every two seconds during
its journey.

Each agent's value function is derived from encoding the knowledge that benefits
to the group are higher when each measurement is taken from as close as
possible to the object with the highest quality sensor. The agents also incur costs
when taking the measurement because there is assumed to be a 2-second warm-up
time, but once warmed up, the sensor can take a measurement every two seconds.
The cost function exhibits task interaction as a result of this warm-up time: costs
will be lower if the same sensor takes consecutive measurements. Benefits and
costs are quantified by a local utility function known to each agent. Agents do not
know the utility or even the location of other agents.

The experiments were designed to test the hypothesis that the Allocation Improvement
procedure is an effective strategy for task allocation in the tracking
domain. AIM is expected to perform well in the early stages of Mediation in this
domain because it searches for gains that come from assigning the best agent to
each task (which requires a relatively small number of iterations) before it tries to
find improvements based on assignments for pairs of tasks, triples of tasks, and so
on.

The Mediation algorithm was implemented with agent message and value
functions that adhere to the assumptions presented in Theorem 2.1. In $msg_i^p$,
agent $a_i$ reports the value associated with its quality and expected distance from
the object, less the warm-up costs, for the set of tasks to which it has been allocated
in the latest proposal. $\mathrm{VALUE}(msg^p)$ is chosen to be the sum of the values
reported in each agent's message.
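To make the message and value functions concrete, here is a minimal sketch. The text does not give the actual utility formula, so everything numeric here is an assumption chosen only to show the structure: the benefit term `quality / (1 + distance)`, the unit `warmup_cost`, the straight vertical object path, and the measurement times 2, 4, ..., 10 seconds are all hypothetical.

```python
import math


def object_position(t, x0=50.0, side=100.0, duration=10.0):
    """Assumed path: straight down from the top edge (y = side) to the bottom edge."""
    return (x0, side * (1.0 - t / duration))


def agent_message(position, quality, slots, warmup_cost=1.0):
    """Value one agent reports for the measurement slots assigned to it.

    Slot k is the measurement at time 2 * (k + 1) seconds (an assumption).
    A warm-up cost is charged only when the previous slot is not also held,
    which is the source of task interaction in the cost function.
    """
    slots = set(slots)
    total = 0.0
    for k in sorted(slots):
        ox, oy = object_position(2.0 * (k + 1))
        distance = math.hypot(position[0] - ox, position[1] - oy)
        total += quality / (1.0 + distance)  # closer and better sensors report more
        if k - 1 not in slots:
            total -= warmup_cost  # sensor was idle, so it must warm up
    return total


def value(messages):
    """VALUE as described in the text: the sum of the reported values."""
    return sum(messages)


pos, quality = (50.0, 80.0), 10.0
both = agent_message(pos, quality, {0, 1})
separate = agent_message(pos, quality, {0}) + agent_message(pos, quality, {1})
```

Under these assumptions, holding two consecutive slots is worth exactly one warm-up cost more than holding them as two unrelated tasks, which is the interaction that sequential auctions cannot see.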

AIM  was  compared  to  other  instances  of  the  Mediation  algorithm  using  two 
different  types  of  update  rules.  Full  Search  simply  returns  successive  elements  of 
P  as  they  would  be  explored  in  a  depth-first  search.  Random  Mediation  returns  a 
random  element  of  P  at  each  iteration.  In  Section  2.5  we  discuss  ways  to  make 
Mediation’s  search  smarter. 

Results: Each negotiation method was run on 100 problem instances. The first
experiment was run with a group of 4 agents, which implies |P| = 1025 (including
∅). Figure 2.6 shows the social welfare of proposal b after each iteration of the
Mediation Algorithm, using the three different update rules, along with 95% confidence
intervals. The maximum average social welfare attainable differed slightly
for the three algorithms since each method was run on 100 different sets of problem
data. The results strengthen the hypothesis that Allocation Improvement is an
effective strategy in this domain, especially when agents may search through only
a small number of proposals before they require an outcome and must act.

Figure 2.6: Comparison of Mediation algorithms in the 4-agent sensor domain

After just 30 iterations, the average social welfare of the group is 219.3 using
AIM versus 181.0 using Random Mediation. The highest average social welfare
attainable by exhaustive search is approximately 228. AIM allows agents to
quickly capture the gains in social welfare that result from assigning
each task to the closest agent by delaying the search for gains based on task interaction.
As expected, after a large number of iterations, any update procedure will
have exhausted the search space, and all procedures will have found outcomes of
similarly high quality.

The results shown in Figure 2.7 are for groups of 20 agents. With 20 agents, P is
of size 3.2 million, and a full search of the space is prohibitively costly for agents
that must act quickly. The graph shows the average social welfare after the first
1000 iterations of each Mediation algorithm across the 100 problem instances.
The focus is on the first 1000 iterations because the negotiation must
be short due to the real-time nature of the problem. Random Mediation and Full
Search perform fairly well in the very early stages of the algorithm. However, after
about 65 iterations, or 0.002% of the search space, AIM performs significantly
better than either of the other two methods (p < 0.01). The superior performance
of AIM is more pronounced in the figure for this larger
problem because only a very small percentage of the search space is shown. The
gains of AIM come very early in the negotiation, as expected.

Figure 2.7: Comparison of mediation algorithms in the 20-agent sensor domain

The Random and Full Search update rules outperform Allocation Improvement
at the very early stages of the search. This effect may be due to the sequential
update procedure of Allocation Improvement (i.e., all agents are considered
for a given task before the allocations to the other tasks are searched).


2.4  Combinatorial  task  allocation 

To  study  the  impact  of  general  task  interaction  on  the  task  allocation  problem 
we  consider  a  set  of  synthetic  multi-agent  domains.  The  simplification  of  the 
challenge  problem  presents  a  specific  type  of  task  interaction,  namely  subadditive 
cost  functions  that  result  from  savings  in  warm-up  time.  A  particular  domain  is 
associated  with  a  certain  interaction  probability,  denoted  by  ip.  This  probability 
quantifies  the  extent  to  which  the  tasks  interact  in  an  agent’s  cost  function. 

A domain in which ip=0 describes a situation in which there is no task interaction.
A domain in which ip=1 describes a scenario where tasks have arbitrary
interaction.

The following algorithm is used to determine each agent's cost function. First,
construct an undirected graph with m vertices; each vertex corresponds to a task.
Second, assign a set of edges to the graph. Each pair of vertices is connected by
an edge with probability ip. A set of tasks interacts if and only if the subgraph
induced by the vertices that correspond to those tasks is connected.

Figure 2.8: Task interaction graph and cost function for Agent 1


For the purposes of experimentation, the cost to an agent of performing a set
of interacting tasks $B_i$ is assigned randomly, according to a uniform distribution
on real values from 0 to $|B_i|$. The cost of performing a set of tasks in which some
of the tasks do not interact is the sum of the costs of performing each interacting
subset of tasks.
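The construction of a synthetic cost function can be sketched as follows. This is an illustration of the description above rather than the generator used in the experiments: the names `make_interaction_graph`, `connected_components`, and `AgentCost` are hypothetical, and costs are sampled lazily and cached the first time a connected set is priced, which is an implementation choice not stated in the text.

```python
import itertools
import random


def make_interaction_graph(m, ip, rng):
    """Each pair of the m task vertices gets an edge with probability ip."""
    return {frozenset(pair) for pair in itertools.combinations(range(m), 2)
            if rng.random() < ip}


def connected_components(tasks, edges):
    """Split a task set into the connected parts of its induced subgraph."""
    remaining, parts = set(tasks), []
    while remaining:
        frontier = [remaining.pop()]
        part = set(frontier)
        while frontier:
            u = frontier.pop()
            for v in [v for v in remaining if frozenset((u, v)) in edges]:
                remaining.remove(v)
                part.add(v)
                frontier.append(v)
        parts.append(frozenset(part))
    return parts


class AgentCost:
    """One agent's cost function over sets of tasks."""

    def __init__(self, m, ip, rng):
        self.edges = make_interaction_graph(m, ip, rng)
        self.rng = rng
        self.table = {}  # cost of each interacting (connected) set, sampled once

    def cost(self, tasks):
        total = 0.0
        for part in connected_components(tasks, self.edges):
            if part not in self.table:
                # uniform on [0, |B_i|] for an interacting set B_i
                self.table[part] = self.rng.uniform(0.0, len(part))
            total += self.table[part]  # non-interacting parts simply add up
        return total


rng = random.Random(1)
independent = AgentCost(4, 0.0, rng)  # ip = 0: no task interaction
arbitrary = AgentCost(4, 1.0, rng)    # ip = 1: every set of tasks interacts
```

With ip=0 the cost of any set is the sum of single-task costs, while with ip=1 every non-empty set is priced independently, which is what makes bid generation exponential in the number of tasks.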

Figure 2.8 provides an example task interaction graph and cost function, where
$\beta_2$ and $\beta_3$ interact. As a result, execution of both tasks has a different cost than
the sum of costs of the individual tasks (i.e., $1.2 \neq 0.5 + 0.9$). In this case, $\beta_2$ and
$\beta_3$ exhibit a positive interaction because the cost of performing both is lower than
the sum of the costs of performing each task alone.

When a given task is being awarded under sequential or parallel auctions, outcomes
of other task auctions may be unknown. When tasks interact, this can result
in task allocations that are inefficient. That is, there may be another task allocation
that has lower cost.

Ideally, a combinatorial auction would solve this inefficiency by allowing all
task assignments to be made as a package. Combinatorial auctions for task allocation
pose two new problems for a group of agents to consider. First, an agent
must decide which bundles to bid on, called the bid generation problem. Second,
based on those bids, the auctioneer must determine which set of task allocations
corresponds to the minimum total cost for the group of agents.

The second problem has been studied extensively in the AI literature [Sandholm 1999a,
inter alia]. Although winner determination is NP-complete, Sandholm and Suri
[Sandholm and Suri 2000] have developed an anytime search algorithm that uses
the set of bids to focus the search on fruitful areas of the search space and can
provide approximate solutions. They also point out that typical combinatorial
auctions do not allow agents to express bids for tasks with negative interaction
(subadditive preferences). The winner determination algorithm is significantly
complicated by the inclusion of methods for allowing such bids.

Bid Generation: The problem of bid generation has not received much attention,
and is not trivial. We define a relevant bid as one that cannot be inferred as
an additive combination of an agent's other bids. For example, if an agent bids a
cost of 0.3 units for $\{\beta_1\}$ and 0.4 units for $\{\beta_2\}$, an addition would imply a bid
of 0.7 units for tasks $\{\beta_1, \beta_2\}$. A bid of, say, 0.5 units for tasks $\{\beta_1, \beta_2\}$ would be
a relevant bid. To achieve a minimum cost allocation, it is important to generate all
relevant bids (Theorem 2.2). However, it may be infeasible for an agent to generate
all of its relevant bids (Theorem 2.3). Thus, implementing a combinatorial
auction for task allocation that presupposes bid generation may be infeasible.

Theorem 2.2 If not all relevant bids are generated, then the task allocation
computed by combinatorial auction winner determination may be
suboptimal.

Proof (by counterexample). Assume a 2-agent group action that decomposes
into two tasks, where the cost functions are given by: $c_1(\{\beta_1\}) = 0.5$, $c_1(\{\beta_2\}) =
1.0$, $c_1(\{\beta_1, \beta_2\}) = 0.6$ and $c_2(\{\beta_1\}) = 1.0$, $c_2(\{\beta_2\}) = 0.5$, $c_2(\{\beta_1, \beta_2\}) = 1.6$.
Assume the relevant bid by Agent 1 of 0.6 units for $\{\beta_1, \beta_2\}$ is not generated. The
outcome of a combinatorial auction would be the allocation of $\beta_1$ to Agent 1 and
$\beta_2$ to Agent 2 for a total cost of 1.0. However, the optimal allocation would be to
assign both tasks to Agent 1, for a total cost of 0.6.
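The counterexample can be replayed with a brute-force winner determination. This is a sketch for illustration only: it enumerates every assignment of tasks to agents, which is feasible only for tiny instances, and it prices a bundle with no explicit bid as the cheapest additive combination of the agent's other bids, following the definition of a relevant bid. The function names and the string labels `"b1"` and `"b2"` (standing for $\beta_1$ and $\beta_2$) are hypothetical.

```python
import itertools


def implied_cost(bids, bundle):
    """Cheapest way to price `bundle` as a sum of disjoint bid bundles."""
    bundle = frozenset(bundle)
    if not bundle:
        return 0.0  # the bid on the empty set is defined as 0
    anchor = min(bundle)  # some bid in the cover must contain this task
    best = float("inf")
    for bid_bundle, price in bids.items():
        if anchor in bid_bundle and bid_bundle <= bundle:
            best = min(best, price + implied_cost(bids, bundle - bid_bundle))
    return best


def winner_determination(all_bids, tasks):
    """Exhaustively search task-to-agent assignments for the minimum total cost."""
    agents = sorted(all_bids)
    best_cost, best_alloc = float("inf"), None
    for owners in itertools.product(agents, repeat=len(tasks)):
        alloc = {a: frozenset(t for t, o in zip(tasks, owners) if o == a) for a in agents}
        cost = sum(implied_cost(all_bids[a], alloc[a]) for a in agents)
        if cost < best_cost:
            best_cost, best_alloc = cost, alloc
    return best_cost, best_alloc


B1, B2 = "b1", "b2"
# Bids from the proof, with Agent 1's relevant bid on {b1, b2} withheld.
partial = {
    1: {frozenset([B1]): 0.5, frozenset([B2]): 1.0},
    2: {frozenset([B1]): 1.0, frozenset([B2]): 0.5, frozenset([B1, B2]): 1.6},
}
# The same bids with the relevant bid of 0.6 included.
full = {**partial, 1: {**partial[1], frozenset([B1, B2]): 0.6}}

cost_partial, alloc_partial = winner_determination(partial, [B1, B2])
cost_full, alloc_full = winner_determination(full, [B1, B2])
```

Without the bundle bid the auction settles on one task per agent; once the bid is present, both tasks go to Agent 1 at lower total cost.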

Theorem 2.3 If ip=1, the number of relevant bids for each agent is $2^m - 1$.

Proof. The cost of a task set that is not fully connected can be determined
by the sum of its connected parts. Therefore, a bid is relevant if and only if its
tasks are fully connected. The task interaction graph is fully connected (ip=1).
Therefore, the number of fully connected sets is the size of $\mathcal{P}(B_t)$, which is $2^m$, less
the bid on the empty set, which is defined as 0.

2.4.1  Incremental  Task  Allocation  Improvement  Algorithm 

In this section, we present an anytime algorithm, called Incremental Task Allocation
Improvement (ITAI), that does not require a bid generation phase as input.
Agents incrementally reveal their costs for bundles of tasks.

1. Generate an initial allocation (e.g., by sequential auction)

2. Initialize CG, an unconnected graph with m vertices, each corresponding
to a task

3. Iteratively improve the allocation as follows:

• Add an edge that connects two unconnected subgraphs

• Optimally allocate the tasks that correspond to the vertices in the newly
connected subgraph

Figure 2.9: Anytime Algorithm for Task Allocation


The algorithm is summarized in Figure 2.9. One way of performing the initial
allocation of Step 1 quickly is by sequential auction. As discussed above,
task interaction may cause this allocation to be suboptimal.

The task connection graph initialized in Step 2 directs the improvement phase
of Step 3. At each iteration of the improvement phase, one edge is added to
connect two unconnected subgraphs. For example, on the first iteration, an edge
is added between any two of the vertices. On the second iteration, an edge may be
added between two other vertices, or between one other vertex and one of the two
previously connected vertices (thus creating a connected 3-vertex subgraph).

On each iteration, an optimal allocation (e.g., by a combinatorial auction
with optimal winner determination) is made for the tasks corresponding to the
newly connected subgraph. The procedure terminates when CG is connected
(i.e., adding an edge cannot connect two unconnected subgraphs). The algorithm
is anytime because it can be stopped at any point during the improvement phase
and can return the lowest cost allocation attained so far.

To generate the initial allocation, an agent need reveal only m costs, one for
each initial task allocation. In the improvement phase, even if a combinatorial
auction algorithm is used, an agent is initially faced with a much simpler bid generation
problem, because the algorithm is run over a small number of tasks. If
an exhaustive enumeration of task allocations is used instead of a combinatorial
auction in that phase, the bid generation problem is replaced by incremental revelation
of costs for sets of tasks.
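The three steps of Figure 2.9 can be sketched as a Python generator that yields the current allocation after every iteration, so a caller can stop at any point. This is a sketch under stated assumptions, not the authors' implementation: the connection graph is tracked only as a partition of the tasks (adding an edge between two unconnected subgraphs is modeled as merging two parts), the pair to merge is chosen at random, the optimal allocation in Step 3 is found by exhaustive enumeration, and `toy_cost` is invented data.

```python
import itertools
import random


def total_cost(alloc, cost, n_agents):
    """Sum over agents of the cost of the bundle each agent holds."""
    return sum(cost(a, frozenset(t for t, who in alloc.items() if who == a))
               for a in range(n_agents))


def sequential_auction(n_agents, n_tasks, cost):
    """Step 1: award tasks one at a time to the lowest marginal-cost bidder (no look-ahead)."""
    alloc = {}
    for t in range(n_tasks):
        def bid(a):
            held = frozenset(x for x, who in alloc.items() if who == a)
            return cost(a, held | {t}) - cost(a, held)
        alloc[t] = min(range(n_agents), key=bid)
    return alloc


def itai(n_agents, n_tasks, cost, rng):
    """Yield the allocation after the initial auction and after each improvement iteration."""
    alloc = sequential_auction(n_agents, n_tasks, cost)
    parts = [frozenset([t]) for t in range(n_tasks)]  # Step 2: no edges yet
    yield dict(alloc)
    while len(parts) > 1:  # Step 3
        first, second = rng.sample(parts, 2)  # "add an edge" between two subgraphs
        merged = first | second
        parts = [p for p in parts if p not in (first, second)] + [merged]
        tasks = sorted(merged)
        best, best_cost = alloc, total_cost(alloc, cost, n_agents)
        # Re-allocate only the newly connected tasks; all other tasks stay fixed.
        for owners in itertools.product(range(n_agents), repeat=len(tasks)):
            candidate = {**alloc, **dict(zip(tasks, owners))}
            c = total_cost(candidate, cost, n_agents)
            if c < best_cost:
                best, best_cost = candidate, c
        alloc = best
        yield dict(alloc)


def toy_cost(agent, bundle):
    """Assumed costs: agent 0 has a large discount on bundles, agent 1 is purely additive."""
    k = len(bundle)
    if k == 0:
        return 0.0
    return 1.0 * k - 0.9 * (k - 1) if agent == 0 else 0.9 * k


steps = list(itai(2, 3, toy_cost, random.Random(0)))
costs = [total_cost(s, toy_cost, 2) for s in steps]
```

In this toy instance the sequential auction gives every task to agent 1, because agent 1 is slightly cheaper on each task in isolation; the improvement phase then moves tasks to agent 0 as larger bundles are considered, and the total cost never increases from one yielded allocation to the next.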

Theorem 2.4 The algorithm is guaranteed to find the optimal task allocation.

Proof. In the final iteration, CG is connected. If the algorithm then optimally
allocates all m tasks corresponding to vertices in CG, then the allocation will be
optimal.

Time Complexity: Similar to the general iterative deepening search algorithm,
ITAI incrementally expands the scope of its search for the optimal task allocation.
The time spent on task allocation is the sum of the time spent generating the
initial allocation, plus the time spent improving the allocation. As Theorem 2.5
illustrates, the complexity of the algorithm is the same as that of an algorithm that
performs optimal allocation of all tasks in a single step.

Theorem 2.5 Assuming an iteration of the improvement phase that allocates i
tasks takes $O(n \cdot 2^i)$ time, the running time of the improvement phase is $O(n \cdot 2^m)$.

Proof. The maximum number of improvement steps results if a single vertex
is connected to the subgraph at each iteration of Step 3. In this case, there are
m − 1 steps, with the number of connected vertices running from 2 to m.
The total running time of the improvement phase is:

$$O\left(\sum_{i=2}^{m} n \cdot 2^{i}\right) = O\left(2n\,(2^{m} - 2)\right)$$

which is $O(n \cdot 2^m)$.

2.4.2  Empirical  Evaluation 

The generalized task interaction domain described earlier is used to generate results
for ITAI. Figure 2.10 illustrates the movement of the total cost attained by the
successive sequence of allocations generated by the algorithm. The experiment
was run in a 2-agent, 10-task domain. Therefore, there were $2^{10} = 1024$ total
possible task allocations.

The three separate experiments correspond to domains with ip=0, 0.5, and 1,
and the results shown are averages over 20 instantiations of each domain. Due to
slight variations in assigned cost functions in the three cases, the costs have been
normalized, with the cost of the initial allocation corresponding to a value of 1 on
the y-axis. The initial allocation was made by sequential auction, with no look-ahead
(agents computed their bid for a task assuming they would execute only
those tasks already allocated to them). The optimal allocation algorithm used
in the improvement phase iterated over all possible task allocations, collecting a
bid for each from each agent, and updating the allocation with that which has the
minimum sum of bids. The generation and evaluation of bids for each hypothetical
allocation is referred to as an improvement step.

Figure 2.10: Improvement of task allocation cost under ITAI.

Since  the  choice  of  the  edge  to  add  in  Step  3  is  nondeterministic,  the  number 
of  steps  required  for  the  improvement  phase  of  the  algorithm  to  complete  varied. 
The  mean  was  1765;  the  maximum  was  2035,  at  which  point  the  allocation  was 
guaranteed  to  be  optimal  (by  Theorem  2.4). 
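One way to account for the maximum of 2035 steps is offered here as an assumption rather than something stated in the text. Suppose the subgraph grows by one vertex per iteration, and an iteration that connects $i$ tasks evaluates the $2^i - 1$ allocations of those tasks between the 2 agents other than the current one. Then:

```latex
\sum_{i=2}^{10} \left(2^{i} - 1\right) = \left(2^{11} - 4\right) - 9 = 2035 .
```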

As expected, for ip=0, the improvement phase did not lower the cost of the
allocation because all tasks were independent for both agents. When task interaction
was introduced, the improvement phase of the algorithm performed well. For
ip=0.5, after 196 steps, the improvement phase had generated an average of 25%
in cost savings, which corresponds to over 64% of the potential total cost savings
of 61%. For ip=1, after only 93 steps, the improvement phase had generated an
average of 25% in cost savings, which corresponds to over 53% of the potential
total cost savings of 53%.


2.5 Dynamic negotiation

The development of negotiation protocols to support the distributed allocation of
tasks or resources within a multiagent system has assumed that the set of issues
over which agents negotiate is defined in advance. Most realistic problem domains,
however, are dynamic: while agents are negotiating over the distribution
of a set of tasks, for example, new tasks or resources might appear or disappear.
Existing protocols either require that the allocation of a new task be postponed
until the current negotiation is complete or require that the negotiation be interrupted
and restarted in the context of the augmented set of tasks. Adopting the
first option neglects the potential for exploiting possible positive interactions between
the old set of tasks and the new task, since they are allocated separately. In
the second option, all of the work performed in the first negotiation is lost and,
in fact, there can be no guarantee that the process will ever converge since new
tasks could continue to appear. Ideally, it should be possible to adapt the ongoing
negotiated allocation to the new task. Similarly, agents not currently involved in
a negotiation might become free to offer their services on the task being negotiated.
If an agent becomes disabled, it should also be possible to repair the current
allocation as reflected by the current state of the negotiation rather than restart the
negotiation from scratch.

The problem of dynamic negotiation is reminiscent of that faced by conventional
single-agent planning systems in the 1980s: then it was assumed that complete
executable plans could be generated before execution commenced; the introduction
of new goals during the planning process required a complete replanning.
It was discovered that such a view was unrealistic for any but simple toy domains.

In  this  section,  we  examine  negotiation  in  team-based  settings  where  tasks  are 
not  necessarily  limited  to  primitive  actions  but  are  instead  richly  described  entities 
[Tambe 1997, Rauenbusch 2003, Ortiz and Hsu 2002]. As before, our concern
is  in  domains  where  full  exchange  of  information  among  agents  is  not  feasible; 
hence,  mediators  will  have  to  operate  in  the  face  of  incomplete  information.  In 
addition,  we  assume  that  the  agents  participating  in  such  teams  will  be  concerned 
with  the  means  for  executing  only  their  individual  tasks;  hence,  the  mediator  will 
normally  not  concern  itself  over  local  planning  decisions  of  any  one  agent.  The 
main  idea  behind  our  solution  involves  the  exchange  of  local  information  in  the 
form  of  a  description  of  positive  and  negative  interactions  between  task  types.  By 
focusing  on  task  types,  mediators  can  use  prior  bids  in  a  predictive  fashion  to 
prune  the  search  space  of  possible  future  proposals. 

Examples of domains for which the approach we describe here is
applicable include distributed transportation scheduling and distributed sensor
scheduling. For generality we present our results within the first domain: the set
of tasks might include delivery and pickup tasks, each with parameters for source,
destination, time, object to deliver, truck for delivery, and path to destination. The
chosen path and the identity of the truck correspond to planning decisions made
at the local agent level. Tasks in the sensor domain map almost directly to tasks in
the transportation domain. In the sensor domain, multiple sensors are necessary
for estimating the location of an object. As we have already discussed, tasks can
interact: for example, if sectors require a warm-up time, an agent can benefit from
tracking two targets in sequence because of the saved warm-up time. Corresponding
to a delivery(Agent, Destination, Object, Truck, Time) task description in
the transportation domain might be a task in the sensor domain represented as
track(Sensor, UntilTime, Object, Sector, Time).

Each executable task description has one or more associated types taken from
the set Types. We assume a function $type : T \to 2^{Types}$, where $T$ is the
set of tasks. For example, an executable task description might be of the form
t = deliver(Agent2b, Propane, Warehouse), where type information is specified
by type(t) = {delivery-to-Warehouse, delivery-of-Propane}. Bidders
use such information to communicate task interactions to the mediator (see next
section). We add the notion of agent capacity to the problem statement: let
$c : A \times \mathbb{R} \to \mathbb{N}$ denote an agent's current capacity; e.g.,
$c(a_5, t) = 3$ means that agent $a_5$ can take on 3 more tasks.

In  this  section,  we  imagine  that  each  agent’s  state  (from  the  set  S  of  states)  is 
structured  in  some  way.  For  example,  the  agent  might  store  information  regarding 
its  current  location,  past  proposals  that  it  has  received,  a  cost/utility  function,  and 
its  current  capacity.  The  mediator,  in  turn,  has  only  partial  knowledge  of  any  in¬ 
dividual  agent;  the  mediator  records  task  preference  information  of  non-mediator 
agents  in  what  we  call  an  interaction  table,  X,  and  capacity  status  information  sup¬ 
plied  to  it  by  way  of  rich  bids  by  a  bidder.  The  task  interaction  table  is  built  up  by 
the mediator for each agent and records interactions between task types. For
example, agent $a_i$ might report the following interactions:
$\{t_1 + t_4,\ t_1 \parallel t_2,\ t_1 - t_5,\ t_4 \bowtie t_6\}$, where
$t_j \in Types$, indicating that types $t_1$ and $t_4$ interact positively, $t_1$
and $t_2$ are independent (that is, their costs are additive), $t_1$ and $t_5$
interact negatively, and $t_4$ and $t_6$ conflict (cannot be executed together).
In a delivery domain, $t_4$ might correspond to a delivery of flammable goods
task type and $t_6$ to a delivery of explosive materials task type. The stated
task information constrains
task  assignments  involving  delivery  of  both  types  of  materials.  The  alternative 
involving  the  comparison  of  fully  specific  executable  tasks  would  be  less  useful. 
For example, if the actual task allocations were delivery of Propane-j35
to Logan at 2pm and delivery of fireworks to Logan at 2:30pm, reporting
a negative interaction between the two would have little value in helping
to prune the space of future proposals, because of their specificity.
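
The type-level bookkeeping described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the project: the `InteractionTable` class, the string type names, and the symbols `'+'`, `'-'`, `'|'`, and `'x'` are all hypothetical choices made for this example.

```python
from itertools import product

# Assumed symbols: '+' positive, '-' negative, '|' independent, 'x' conflict.
class InteractionTable:
    """Per-agent record of reported interactions between task *types*."""

    def __init__(self):
        self._table = {}  # agent -> {frozenset({type_a, type_b}): symbol}

    def report(self, agent, type_a, type_b, symbol):
        self._table.setdefault(agent, {})[frozenset((type_a, type_b))] = symbol

    def lookup(self, agent, type_a, type_b):
        return self._table.get(agent, {}).get(frozenset((type_a, type_b)))

    def conflicts(self, agent, types_a, types_b):
        """True if any pair of types drawn from the two tasks is known to conflict."""
        return any(self.lookup(agent, a, b) == "x"
                   for a, b in product(types_a, types_b))


table = InteractionTable()
table.report("truck1", "delivery-of-flammable", "delivery-of-explosive", "x")

# A new, fully specific task is matched through its types, not its identity.
propane_run = {"delivery-of-flammable", "delivery-to-Logan"}
fireworks_run = {"delivery-of-explosive", "delivery-to-Logan"}
print(table.conflicts("truck1", propane_run, fireworks_run))  # True
```

Because the report is keyed on types, it applies to any later pair of tasks that carry those types, which is what lets the mediator prune proposals it has not yet seen.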

The  capacity  information  might  indicate  that  a  particular  agent  (e.g.,  truck) 
can  fit  one  more  crate.  However,  the  cost  for  delivery  of  that  crate  might  depend 
on  the  type  of  task  with  which  it  is  associated  (e.g.,  the  destination,  the  crate’s 
weight,  or  path). 

Definition 2.5.1 (A Dynamic Negotiation) Let $t \triangleright_i a$ refer to the
set of tasks, $t$, assigned to agent $a$ in proposal $P_i$. We augment the
definition of a task allocation system, $M$, to include capacity status:
$M = (A, T, S, u, c, \mathcal{P})$. A dynamic mediated negotiation, $N$, is a
sequence of messages

$$(P_1; B_1; P_2; B_2; \ldots; \Delta_j; \ldots; P_n; B_n; P_{n+1})$$

such that each $P_i \in \mathcal{P}$ for $i \le n+1$ and each
$B_i = \{bid_1, bid_2, \ldots, bid_m\}$ corresponds to a set of rich bids from
each agent in $P_i$. Each pair of indices for $P$ and $B$ corresponds to a round
in the mediation. Each $\Delta_j$ represents a pair,
$\Delta_j = (T_{new}, A_{new})$, of new tasks and agents; more than one $\Delta$
can appear in any sequence. The social welfare, $W$, for some proposal, $P_i$, is
defined as:

$$W(P_i) = \sum_{a_j \in P_i} u(a_j, t \triangleright_i a_j)$$
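
As a small worked illustration of the social welfare sum, the sketch below evaluates a proposal by adding up each agent's value for the tasks assigned to it. The dictionary encoding of a proposal and the toy valuation `toy_u` are assumptions made for this example only.

```python
def social_welfare(proposal, u):
    """Sum each agent's value for the set of tasks the proposal assigns to it.

    proposal: dict mapping agent -> set of tasks (an assumed encoding of P_i)
    u: callable (agent, frozenset_of_tasks) -> float, the agent's valuation
    """
    return sum(u(agent, frozenset(tasks)) for agent, tasks in proposal.items())


# Toy valuation (hypothetical): each task is worth 2, with a bonus of 1 for a pair.
def toy_u(agent, tasks):
    return 2 * len(tasks) + (1 if len(tasks) == 2 else 0)


proposal = {"a1": {"t1", "t2"}, "a2": {"t3"}}
print(social_welfare(proposal, toy_u))  # 7
```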


2.5.1  Rich  bids 

Rather than reducing a bid for some task to a single value, Dynamic Mediation
makes  use  of  a  richer  bid  format  which  allows  a  bidder  to  compactly  exchange 
relevant  information  about  its  local  state  to  the  mediator.  The  mediator  can  then 
use  that  information  during  its  search.  We  refer  to  the  set  of  resources  (agents)  and 
tasks  collectively  as  the  negotiation  space.  The  negotiation  space  might  change 
because  of  either  a  negotiation  event  (the  mediator  considers  a  new  resource)  or  a 
domain  event  (a  new  task  appears). 

The format of a bid for $a_i$ at time $t$ for proposal $P$ is:


where the first argument represents the agent's bid for each task in the proposal,
$I$ represents a list of relevant task interactions from $a_i$'s point of view for
a subset of the tasks covered in $P$, and $c(a_i, t)$ reflects agent $a_i$'s
current capacity. This
is  by  no  means  the  only  alternative  possible.  One  could  also  consider  providing 
counterfactual  values  for  each  task  that  is  reported  to  interact  in  a  negative  way 
with  another  task.  Another  alternative  is  to  communicate  a  qualitative  preference 
to  some  other  task  [[Doyle  et  al  1991]].  Such  information,  however,  does  not  cap¬ 
ture  the  reason  for  the  preference:  this  is  captured  compactly  in  task  interaction 
statements.  Since  we  were  most  interested  in  minimizing  the  amount  of  informa¬ 
tion  shared  with  the  mediator,  we  chose  the  format  shown  above. 
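
One way to picture the three components of a rich bid (per-task values, interaction statements, and capacity) is as a small record. The sketch below is illustrative only; the class names, field names, and the `'+'`, `'-'`, `'x'` encoding are assumptions and do not reproduce the report's message format.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Interaction:
    type_a: str
    type_b: str
    kind: str  # assumed encoding: '+', '-', or 'x' (conflict)


@dataclass
class RichBid:
    agent: str
    task_values: dict                                  # task id -> bid for that task
    interactions: list = field(default_factory=list)   # the list I of type interactions
    capacity: int = 0                                  # c(a_i, t): further tasks the agent can accept


bid = RichBid(
    agent="a1",
    task_values={"t7": 12.5, "t9": 4.0},
    interactions=[Interaction("delivery-to-Warehouse", "delivery-to-Warehouse", "+")],
    capacity=3,
)
print(sum(bid.task_values.values()), bid.capacity)  # 16.5 3
```

The interaction list stays short because it names task types rather than individual tasks, which matches the stated goal of minimizing what is shared with the mediator.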

Task  interaction  semantics  and  bid  generation 

A positive interaction between tasks $t_1$ and $t_2$ reflects task complementarity
in terms of an underlying cost function, $u$, with a superadditive property for
some agent $a$: $u(a, \{t_1, t_2\}) < u(a, \{t_1\}) + u(a, \{t_2\})$; for example,
common delivery destinations. A negative interaction corresponds to task
substitutability (subadditivity),
$u(a, \{t_1, t_2\}) > u(a, \{t_1\}) + u(a, \{t_2\})$; for example, a delivery
requiring two separate trips.
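
The two inequalities can be checked directly when a cost function is available. The following sketch classifies a pair of tasks for one agent; the function `classify_interaction`, the `'|'` symbol for the additive case, and the toy trip-cost model are hypothetical and serve only to make the definitions concrete.

```python
def classify_interaction(u, agent, t1, t2, tol=1e-9):
    """Classify two tasks for one agent using the inequalities in the text.

    u: callable (agent, frozenset_of_tasks) -> cost
    Returns '+', '-', or '|' (independent, i.e. additive costs).
    """
    joint = u(agent, frozenset((t1, t2)))
    separate = u(agent, frozenset((t1,))) + u(agent, frozenset((t2,)))
    if joint < separate - tol:
        return "+"   # cheaper together than apart
    if joint > separate + tol:
        return "-"   # more expensive together than apart
    return "|"


# Hypothetical cost model: each distinct destination costs 10, and serving
# two different destinations adds a detour penalty of 5.
DEST = {"t1": "Logan", "t2": "Logan", "t3": "Warehouse"}


def trip_cost(agent, tasks):
    stops = {DEST[t] for t in tasks}
    return 10 * len(stops) + (5 if len(stops) > 1 else 0)


print(classify_interaction(trip_cost, "a1", "t1", "t2"))  # +
print(classify_interaction(trip_cost, "a1", "t1", "t3"))  # -
```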

The  framework  we  have  described  introduces  the  following  bid  generation 
problem: how should the agent decide on which potential interactions to
communicate to the mediator? The difficulty is that a statement by agent $a_j$ of
$t_1 + t_2$ might hold under context $C$ (i.e., under some collection of other
task assignments for $a_j$ under the current proposal), while under another
context, $C'$, the interaction $t_1 - t_2$ might instead hold. We are currently
exploring a general approach to handle this
problem  by  making  use  of  a  task  abstraction  hierarchy  and  the  observation  that  a 
task  t,  in  context,  C,  can  be  captured  as  a  separate  task  type  itself  [[Ortiz  1999]]. 
The  abstraction  hierarchy  represents  individual  tasks  and  tasks  performed  while 
performing  other  tasks.  The  bidder  picks  the  most  abstract  task  descriptions  which 
stand  in  the  indicated  positive  or  negative  interaction.  In  this  way,  the  task  interac¬ 
tion  table  is  always  consistent.  In  our  actual  algorithm,  we  use  less  detailed  type 
information  and  use  task  interactions  as  heuristics  to  increase  the  probability  of 
finding  a  solution. 

Dynamic  mediation  algorithm 

Figure 2.11 presents the dynamic mediation algorithm. Recall that the algorithm
implements an iterative hill climbing search through the proposal space, keeping
track of the best proposal, b, found so far. At each step the mediator selects and
communicates the current proposal, c, to the agents in the group. This section
extends the algorithm to support dynamics by adding maintenance of an interaction
table to leverage information received from bidders and to focus the search.

function DynamicMediation returns an outcome
    inputs: a set of tasks T, a set of agents A
    let b ← ∅, b_val ← value(∅)
    let InteractionTable it ← ∅
    loop
        c ← getNextProposal(T, A, it)
        broadcast c to A
        for each a_i in A
            receive bid_i from a_i
            store interactions in bid_i in it
        c_val ← VALUE(msg_1, msg_2, ..., msg_n)
        if (c_val ≻ b_val) then
            b ← c, b_val ← c_val
    until (stop signal)
    return b

Figure 2.11: Dynamic Mediation Algorithm
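
The loop in Figure 2.11 can be sketched in Python as shown below. This is a simplified rendering under stated assumptions: a fixed number of rounds stands in for the stop signal, the interaction table is a plain set, higher values are treated as better, and the helper functions in the usage example (`random_proposal`, `toy_bids`) are invented for illustration.

```python
import random


def dynamic_mediation(tasks, agents, next_proposal, collect_bids, value, rounds):
    """Hill-climbing mediation loop in the spirit of Figure 2.11.

    next_proposal(tasks, agents, table) -> proposal
    collect_bids(proposal) -> list of (bid_value, interactions) pairs, one per agent
    value(bid_values) -> score of the proposal (higher is better in this sketch)
    """
    best, best_val = None, float("-inf")
    table = set()                        # interaction table: reported triples
    for _ in range(rounds):
        current = next_proposal(tasks, agents, table)
        bids = collect_bids(current)     # broadcast, then gather one bid per agent
        for _, interactions in bids:
            table.update(interactions)   # remember what bidders reported
        current_val = value([v for v, _ in bids])
        if current_val > best_val:       # keep the best proposal seen so far
            best, best_val = current, current_val
    return best, best_val, table


# Toy usage: two agents; costs are negated so that higher is better.
rng = random.Random(0)
agents, tasks = ["a1", "a2"], ["t1", "t2", "t3"]


def random_proposal(tasks, agents, table):
    return {t: rng.choice(agents) for t in tasks}


def toy_bids(proposal):
    bids = []
    for a in agents:
        load = sum(1 for t in proposal if proposal[t] == a)
        bids.append((-(load ** 2), set()))   # crowded agents bid worse
    return bids


best, best_val, _ = dynamic_mediation(tasks, agents, random_proposal, toy_bids,
                                      sum, rounds=50)
print(best, best_val)
```

Because the best proposal is retained after every round, the loop can be interrupted at any point and still return a usable allocation.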

Each agent then responds with a bid that is based on the proposal that was
broadcast; bid_i denotes the bid sent by agent i. The information about interaction
among task types that is provided in the bid is stored in the Interaction Table.
The mediator uses the information contained in the interaction table to focus its
search. In each round of mediation, the interaction table is used to construct a
probability distribution over the set of agents for each task in the proposal. The
mediator focuses the search by using the interaction information previously
provided to it by some agent to adjust the likelihood that the agent will be
assigned a given task. For example, if the mediator expects a positive interaction
between two tasks for some agent, those two tasks should be more likely to be
assigned to that agent. If the mediator knows that two tasks conflict for a given
agent, that agent should not be assigned to both tasks. An example of an algorithm
for the generation of probability distributions is given in Figure 2.12. This
algorithm implements a stochastic, heuristic-based search that weights task
assignments according to the information contained in the variable, it.

function getNextProposal returns a proposal
    inputs: set of tasks T, set of agents A, InteractionTable it
    let P ← ∅
    for each task t in T
        pd ← getProbabilityDistribution(t, P, A, it)
        a_t ← agent chosen randomly according to pd
        P ← P ∪ {assign(t, a_t)}
    return P

Figure 2.12: Update procedure for Dynamic Mediation.

function getProbabilityDistribution
    inputs: task t, proposal P, set of agents A, InteractionTable it
    returns: a probability distribution over A
    for each agent a in A
        I ← ∅
        for each task u in {u | assign(u, a) ∈ P}
            I ← I ∪ getInteraction(t, u, it)
        if ∃i ∈ I, i ∈ {⋈} then score_a ← 0.0
        else if ∀i ∈ I, i ∈ {+, ?} then score_a ← 2.0
        else if ∀i ∈ I, i ∈ {−, ?} then score_a ← 0.5
        else score_a ← 1.0
    for each agent a in A
        pd(a) ← score_a / Σ_{b∈A} score_b
    return pd
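
The update procedure can be sketched in runnable form as follows. The sketch makes several assumptions that are not in the figures: reports are treated as symmetric, agents with no informative interactions receive a score of 1 (following the prose description of the scoring rather than the quantifiers in the pseudocode), and a uniform fallback is used if every agent scores zero. The `types` mapping and all identifiers are hypothetical.

```python
import random

CONFLICT, POS, NEG, UNKNOWN = "x", "+", "-", "?"


def reported(table, a, b, symbol):
    """Reports are treated as symmetric (an assumption of this sketch)."""
    return (a, b, symbol) in table or (b, a, symbol) in table


def get_interaction(t_types, u_types, table):
    """Combine type-level reports for two tasks into one symbol."""
    found = UNKNOWN
    for tt in t_types:
        for ut in u_types:
            if reported(table, tt, ut, CONFLICT):
                return CONFLICT
            for sym, other in ((POS, NEG), (NEG, POS)):
                if reported(table, tt, ut, sym):
                    if found == other:
                        return UNKNOWN      # inconsistent reports cancel out
                    found = sym
    return found


def get_probability_distribution(task, proposal, agents, table, types):
    """Score each agent for `task` given what it already holds, then normalize."""
    scores = {}
    for a in agents:
        seen = {get_interaction(types[task], types[u], table)
                for u, holder in proposal.items() if holder == a}
        if CONFLICT in seen:
            scores[a] = 0.0
        elif POS in seen and NEG not in seen:
            scores[a] = 2.0
        elif NEG in seen and POS not in seen:
            scores[a] = 0.5
        else:
            scores[a] = 1.0          # no information, or mixed signals
    total = sum(scores.values())
    if total == 0:                   # every agent conflicts: uniform fallback (assumption)
        return {a: 1.0 / len(agents) for a in agents}
    return {a: s / total for a, s in scores.items()}


def get_next_proposal(tasks, agents, table, types, rng=random):
    proposal = {}
    for t in tasks:
        pd = get_probability_distribution(t, proposal, agents, table, types)
        proposal[t] = rng.choices(agents, weights=[pd[a] for a in agents])[0]
    return proposal


types = {"t1": {"to-Logan"}, "t2": {"to-Logan"},
         "t3": {"propane"}, "t4": {"electrical"}}
table = {("to-Logan", "to-Logan", POS), ("propane", "electrical", CONFLICT)}

pd1 = get_probability_distribution("t2", {"t1": "a1"}, ["a1", "a2"], table, types)
pd2 = get_probability_distribution("t4", {"t3": "a1"}, ["a1", "a2"], table, types)
print(pd1)  # a1 scores 2.0 and a2 scores 1.0, so a1 gets probability 2/3
print(pd2)  # a1 scores 0.0 (conflict), so a2 gets probability 1.0
```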

The  mediation  algorithm  supports  dynamic  adjustments  to  the  set  of  tasks  to 
be  negotiated.  Since  the  interaction  table  stores  information  about  task  types, 
that  information  may  be  used  to  determine  probability  information  for  new  tasks 
that arrive after a negotiation begins. The mediation algorithm also has an
anytime property: because it always retains the best proposal found so far, it
remains applicable even if agents do not know in advance how much time they will
have to negotiate.

Note  that  without  the  capacity  parameter,  as  the  system  tries  to  respond  to  new 
tasks,  it  can  eventually  become  saturated  and  thrash.  By  using  capacity  informa¬ 
tion  the  mediator  can  signal  that  execution  should  begin  while  certain  new  tasks 
are postponed. This can be accomplished by simply adding a line to the
getProbabilityDistribution function that assigns zero probability to agents over
capacity.
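
A minimal sketch of that capacity check is given below. It is written as a separate post-processing step rather than as a line inside getProbabilityDistribution, and returning `None` when no agent has room is an assumption about how a mediator might detect that a task should be postponed.

```python
def apply_capacity(pd, capacity):
    """Zero out agents with no remaining capacity and renormalize.

    pd: dict agent -> probability; capacity: dict agent -> remaining task slots.
    Returns None when nobody has room, which a mediator could treat as a
    signal to postpone the task (an assumption of this sketch).
    """
    masked = {a: (p if capacity.get(a, 0) > 0 else 0.0) for a, p in pd.items()}
    total = sum(masked.values())
    if total == 0:
        return None
    return {a: p / total for a, p in masked.items()}


print(apply_capacity({"a1": 0.5, "a2": 0.25, "a3": 0.25},
                     {"a1": 0, "a2": 2, "a3": 1}))
# {'a1': 0.0, 'a2': 0.5, 'a3': 0.5}
```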

We assume that the mediator stores all of the interaction information from
each bidder in a separate table. The memory required for this table grows linearly
in the number of interactions reported. In the algorithm, the search space is either
expanded by adding a new task or a new resource/agent or narrowed (deleting a
task or resource). The mediator can combine steps (for example, removing a task
while adding a resource).

function getInteraction
    inputs: task t, task u, InteractionTable it
    returns: one of {+, −, ⋈, ?}
    interaction ← ?
    for each tt ∈ type(t)
        for each ut ∈ type(u)
            if ((tt, ut, ⋈) ∈ it) then return ⋈
            if (interaction = ?)
                if ((tt, ut, +) ∈ it) then interaction ← +
                if ((tt, ut, −) ∈ it) then interaction ← −
            if (interaction = +)
                if ((tt, ut, −) ∈ it) then return ?
            if (interaction = −)
                if ((tt, ut, +) ∈ it) then return ?
    return interaction

Figure 2.13: Example Construction of Probability Distribution

Task  contention,  team  composition  and  fault  tolerance 

The notion of task contention in the challenge problem is essentially that of a
task conflict, as described above. Consider Figure 2.3. We will use the notation
Sn/s to refer to sector s of sensor n, the notation $t^r_p$ to refer to the point p
on projected path r, and $t^r$ to refer collectively to the tasks on path r.
Suppose a negotiation involving task $t^1$ has been ongoing for n rounds with the
current best proposal, P1 = (S2/1, S2/2, S5/3, S7/3). Target $t^2$ then appears and
is projected to follow the path shown. Suppose that the best allocation for $t^2$,
assuming that there were no conflicts, would be P2 = (S7/1, S8/3, S5/1, S3/2).
However, if tasks $t^1_3$ and $t^2_3$ occur at the same time, a conflict involving
sensor S5 will result. We refer to this as an instance of task contention. There
are two cases to deal with. In the first case, if we assume that the same mediator
is coordinating the negotiation for both targets, then the mediator will be aware
of the commitment of S5 to task $t^1_3$. If it is aware of the contention between
tasks $t^1_3$ and $t^2_3$, then it can adjust its proposal to the group accordingly
to avoid a conflict. However, if some other agent is acting as mediator for $t^2$,
or is not aware of the task contention, then that mediator might very well propose
P2, the allocation resulting in task contention. In very large domains, with
hundreds or even thousands of nodes, it is infeasible to have a single agent that
will centrally coordinate all negotiation. Therefore, agent S5 might respond to P2
with the bid $(-\infty, t^1_3 - t^2_3)$, indicating the conflict and its source. To
resolve this conflict, the mediator may add an option (i.e., enlarge the group of
agents) and propose (S7/1, S8/3, S6/3, S3/2).^ The key point is that the contention
cannot be resolved on the basis of local information alone (such as a bid involving
a single value); hence the richer bid format we have proposed.

Problems having to do with forming teams with the proper mix of sensors are
also dealt with at the mediator level, using rich bids. For example, suppose 3
agents are associated with a task in a proposal and one of them, $a_i$, bids a low
value (such as a short duration measurement) because it is committed to serving on
another tracking team consisting of 5 agents during an overlapping interval. Agent
$a_i$ can relay that information to the mediator (this assumes there are multiple
mediators), who can then decide, perhaps, to have the agent drop the original
commitment because the new task is of higher priority (the marginal gain from the
agent’s contribution on the committed task is lower than that on the new task
involving a smaller team).

In response to a system fault, such as the loss of an agent (some $\Delta$), the
mediator will look for substitute agents. The Mediation algorithm has also been
made tolerant to faults in the mediator agent; this extension is not shown in the
figure,
however.  The  solution  is  simple:  instead  of  communicating  only  with  the  medi¬ 
ator,  agents  broadcast  their  bids  to  the  entire  group.  If  the  mediator  is  disabled, 
a  new  one  can  be  chosen;  it  will  have  a  current  record  of  the  negotiation  and  can 
proceed  where  the  disabled  mediator  left  off. 
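
The broadcast-based recovery scheme can be illustrated with a short sketch. The `Agent` class, the message log, and the rule that the live agent with the lowest id becomes the new mediator are assumptions made for this example; the text states only that a new mediator can be chosen and that it will hold a current record of the negotiation.

```python
class Agent:
    def __init__(self, agent_id):
        self.agent_id = agent_id
        self.alive = True
        self.log = []            # every bid this agent has heard, in order

    def hear(self, round_no, sender, bid):
        self.log.append((round_no, sender, bid))


def broadcast_bid(group, round_no, sender, bid):
    """Bids go to every live member, not only to the mediator."""
    for member in group:
        if member.alive:
            member.hear(round_no, sender.agent_id, bid)


def elect_mediator(group):
    """Assumed election rule: the live agent with the lowest id takes over."""
    return min((m for m in group if m.alive), key=lambda m: m.agent_id)


group = [Agent(i) for i in range(1, 5)]
mediator = elect_mediator(group)            # agent 1
broadcast_bid(group, 1, group[2], {"value": 7})
mediator.alive = False                      # the mediator fails after round 1
mediator = elect_mediator(group)            # agent 2 takes over
print(mediator.agent_id, mediator.log)      # 2 [(1, 3, {'value': 7})]
```

The trade-off is extra message traffic: every bid reaches every group member, in exchange for any member being able to resume the negotiation.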

2.5.2  Experimental  results  and  evaluation 

The  value  of  rich  bids  and  the  interaction  table  as  well  as  the  effectiveness  of 
mediation  in  dynamic  environments  were  evaluated  with  simulations.  We  were 
primarily  concerned  with  studying  the  effectiveness  and  feasibility  of  commu¬ 
nicating  rich  bids,  and  the  incorporation  of  task  interaction  information  into  the 
interaction table. The domain that we developed had instances of the three types
of interactions that we defined in our model: positive, negative, and conflict. It
is a simplified delivery problem in which agents are assigned tasks of delivering
goods to their destinations. Each task was defined by two attributes: type of good
and destination. Goods can be one of three types: propane, electrical, or
food. There were two possible destinations for each task.

^In the actual multisensor domain we have described, there are also instances in
which the contention can only be resolved if a node switches from one sector to
another, halfway between two points on the projected target path, thereby lowering
the quality of the measurement for both targets. This complication can be dealt
with by partitioning the projected track in a more fine-grained way.

The  cost  incurred  by  an  agent  in  executing  a  set  of  tasks  was  determined  in 
advance  and  remained  constant  throughout  the  experiments.  First,  the  cost  of  de¬ 
livering  one  package  to  a  single  destination  was  chosen  randomly  from  a  uniform 
distribution  from  0  to  20.  The  cost  of  performing  each  subsequent  task  was  cho¬ 
sen  from  a  uniform  distribution  from  0  to  the  cost  of  performing  the  previous 
task.  Thus,  as  more  tasks  were  given  to  an  agent,  the  additional  cost  of  delivery 
decreased  monotonically.  Second,  the  cost  to  an  agent  that  is  responsible  for  de¬ 
livering  packages  to  both  locations  is  double  its  cost  for  delivering  to  one  location. 
Third, no agent can be responsible for delivering both propane and electrical
goods. 
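
The three cost rules can be sketched as follows. This is an illustrative reading of the description, not the simulation code: the function names, the tuple encoding of tasks, the destination labels, and the use of `None` for an infeasible set are assumptions.

```python
import random


def make_cost_model(max_tasks, rng):
    """Draw marginal costs: first ~ U(0, 20), each next ~ U(0, previous)."""
    marginal, upper = [], 20.0
    for _ in range(max_tasks):
        upper = rng.uniform(0.0, upper)
        marginal.append(upper)
    return marginal


def agent_cost(tasks, marginal):
    """tasks: list of (good, destination). Returns None when the set is infeasible."""
    goods = {g for g, _ in tasks}
    if {"propane", "electrical"} <= goods:
        return None                          # rule 3: conflicting goods
    cost = sum(marginal[:len(tasks)])        # rule 1: decreasing marginal cost
    if len({d for _, d in tasks}) == 2:
        cost *= 2                            # rule 2: both destinations doubles the cost
    return cost


rng = random.Random(42)
marginal = make_cost_model(10, rng)
one_dest = agent_cost([("food", "A"), ("propane", "A")], marginal)
two_dest = agent_cost([("food", "A"), ("propane", "B")], marginal)
infeasible = agent_cost([("electrical", "A"), ("propane", "A")], marginal)
print(one_dest, two_dest, infeasible)
```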

Agents’  rich  bids  report  the  interaction  among  tasks  that  an  agent  is  assigned. 
In  our  experiments,  an  agent  reports  all  interaction  information  for  each  pair  of 
tasks  in  the  set  of  all  tasks  to  which  it  is  assigned  at  each  iteration.  Conflict  in¬ 
teraction  is  reported  for  each  pair  of  propane  and  electrical  tasks.  Positive 
interaction  is  reported  for  each  pair  of  tasks  delivered  to  the  same  location.  Nega¬ 
tive  interaction  is  reported  for  each  pair  of  tasks  delivered  to  different  locations. 
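
The reporting rules used by the bidders in the experiments can be sketched as a single function. Using the good and the destination directly as type labels, and the `'+'`, `'-'`, `'x'` symbols, are assumptions made for this example.

```python
from itertools import combinations


def report_interactions(assigned):
    """assigned: dict task_id -> (good, destination) for one agent in one round.

    Returns a set of (type_a, type_b, symbol) triples over task *types*.
    """
    reports = set()
    for (_, (g1, d1)), (_, (g2, d2)) in combinations(sorted(assigned.items()), 2):
        if {g1, g2} == {"propane", "electrical"}:
            reports.add((g1, g2, "x"))      # conflict between goods
        if d1 == d2:
            reports.add((d1, d2, "+"))      # same location
        else:
            reports.add((d1, d2, "-"))      # different locations
    return reports


bid_part = report_interactions({
    "t1": ("propane", "A"),
    "t2": ("food", "A"),
    "t3": ("electrical", "B"),
})
print(sorted(bid_part))
# [('A', 'A', '+'), ('A', 'B', '-'), ('propane', 'electrical', 'x')]
```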

Figure  2.13  illustrates  the  probability  distribution  generation  procedure  used 
for  the  experiments.  Tasks  that  are  known  to  conflict  are  never  given  to  the  same 
agent.  Tasks  that  have  positive  interaction  are  more  likely  to  be  given  to  the  same 
agent;  tasks  that  have  negative  interaction  are  less  likely. 

The  getProbabilityDistribution  function  creates  a  probability  distribution  over 
the  agents  to  assign  to  a  given  task.  The  probability  of  assigning  an  agent  to 
any  given  task  is  affected  by  the  tasks  to  which  it  is  already  assigned  and  by  the 
information  stored  by  the  mediator  in  the  interaction  table.  The  function  uses  a 
system  for  scoring  agents’  likelihood  of  assignment  to  the  task.  If  the  task  to  be 
assigned  is  known  to  conflict  with  a  task  type  that  is  already  assigned  to  an  agent, 
assignment  of  that  agent  to  the  task  is  given  a  score  of  zero.  Agents  assigned  to 
tasks  with  types  that  interact  positively  with  the  current  task  are  given  a  score  of 
2.  Agents  assigned  to  tasks  with  types  that  interact  negatively  are  given  a  score 
of  0.5.  Other  agents  are  given  a  score  of  1.  The  scores  are  normalized  to  create  a 
probability  distribution  over  the  agents. 

Since each task can be associated with more than one task type (e.g., delivery
location and type of good), there may be inconsistent interaction information
stored in the interaction table for interaction between tasks. In this experiment,
when inconsistent information is present for a pair of tasks, the getInteraction
function proceeds as though no useful interaction information exists.
Another source of inconsistency arises because an agent may have been assigned
to several tasks, which interact in different ways with the current task. Again,
the mediator in these experiments assumes that no useful interaction information
exists.

Figure 2.14: Results for static mediation.

The  goal  of  the  first  simulation  was  to  evaluate  the  effectiveness  of  using  the 
task  interaction  table  for  making  task  assignments.  The  experimental  data  was 
generated  with  the  delivery  domain  described  above  using  four  agents  and  ten 
delivery tasks. Thus, the search space of all possible proposals was of size
$4^{10}$, or approximately $10^6$.

Figure  2.14  illustrates  the  average  lowest  cost  attained  in  15  trials  by  each  of 
the  first  1000  rounds  of  mediation  using  three  separate  update  procedures:  ran¬ 
dom,  conflict  and  interaction.  Random  ignores  the  interaction  table  and  generates 
a  random  sequence  of  proposals.  Conflict  uses  the  interaction  table  to  eliminate 
assignment  of  tasks  to  agents  where  known  conflicts  exist.  Interaction  is  the  up¬ 
date  procedure  described  above  that  makes  a  probability  distribution  based  on 
reported  positive,  negative  and  conflict  interaction. 

Figure 2.15: Results for dynamic mediation.

The results verify our hypothesis that interaction information and rich bids
have significant benefit. For instance, after just 200 rounds, the average lowest
cost attained using the interaction table is 69.2 while the average lowest cost
without using the interaction table is 93.9. Interaction produces a cost savings of
approximately 26%. Some of this benefit is realized from eliminating conflicting
proposals (9%); the remainder comes from the probability distribution that
weighs positive and negative interactions when assigning agents to tasks in a
proposal. The results confirm our intuition that in certain domains rich bids and
the interaction table benefit task assignment.
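
As a check on the arithmetic, the reported savings follow from the two average costs at round 200, taking the cost without the interaction table as the baseline:

```latex
\[
\frac{93.9 - 69.2}{93.9} = \frac{24.7}{93.9} \approx 0.263 \approx 26\%
\]
```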

The goal of the second set of simulations was to validate the use of rich bids
and the interaction table when dynamic changes occur in the environment. The
expectation was that since interaction information was stored for task types, using
that information when a new task appeared would allow better task assignments to be
made.

Figure 2.15 illustrates the average lowest cost assignment found before and
after two new tasks are introduced at round 1000 of the mediation. Three different
update procedures were compared: random update, interaction, and random
restart. Interaction and random update proceed similarly to interaction and random
in the first simulation for the first 1000 rounds. When new tasks appear, these
procedures use the best assignment found so far for the old tasks and assign
agents to the new tasks. Random update assigns agents randomly for the new tasks;
interaction uses the task type interaction information previously reported to assign
the new tasks. Random restart ignores the best assignment found so far and randomly
assigns agents to all of the tasks.

As expected, there is a benefit in average lowest cost to starting with the best
assignment found so far, as random update and interaction perform better than
random restart for all of the 200 rounds shown after the new tasks appear. In
addition, interaction performs better than random update, which indicates that
information from rich bids recorded before the new tasks appear is useful when
assigning the new tasks. The results shown in Figure 2.16 further strengthen this
finding: interaction continues to perform well when a single new task appears after
every 100 rounds of negotiation after the first 1000 rounds.

Figure 2.16: Results for many task appearances.


2.6  System  architecture:  interleaving  negotiation  and 
execution 

The  implementation  of  the  ideas  described  in  this  chapter  within  the  context  of 
the  challenge  problem  requires  agents  that  do  more  than  just  bid  on  tasks  and  take 
measurements.  The  system  was  architected  so  that  agents  would  normally  take  on 
what  we  will  refer  to  as  background  team  commitments.  These  are  commitments 
that  are  not  related  to  the  result  of  an  award  from  a  negotiation.  We  found  that 
establishing  and  honoring  such  team  level  commitments  played  a  central  role  in 
producing  responsive  behavior.  In  the  sensor  problem  domain,  the  background 
commitments  can  involve,  for  example,  opportunistically  responding  to  measure¬ 
ments  taken  while  scanning  an  area.  An  agent  should  work  for  the  good  of  the 
team  and  if  it  accidentally  takes  a  measurement  in  an  area  that  is  part  of  an  ongo¬ 
ing  task,  it  should  contribute  that  measurement  to  the  mediator  rather  than  discard 
it.  We  discuss  this  further  below. 

In addition, the system was designed to compute expectations or projections
of likely future tasks and also to monitor task assignments for progress. Figure
2.17 illustrates the system architecture. The process begins at the top of the figure
with groups of agents systematically scanning areas to detect movements. At that
stage, the following actions are performed:

1.  All  sensors  turn  sectors  on  in  sequence  (no  assumptions  are  made  about 
possible  starting  points) 

2.  Each  sensor  that  detects  a  target  broadcasts  to  the  group 

3.  A  tracker  agent  (either  pre-designated  or  the  agent  with  the  lowest  id)  esti¬ 
mates  location  based  on  knowledge  of  other  known  targets 

4.  The  tracker  agent  outputs  likely  sectors  based  on  scanning  geometry  (See 
the  next  section) 

Once  a  target  is  detected,  the  sector,  S,  associated  with  it  is  output  to  a  process 
which  forms  an  initial  team  tasked  to  triangulate  and  estimate  the  target’s  loca¬ 
tion  and  velocity.  The  sector,  S,  represents  a  crude  measurement  of  its  general 
location.  The  input  to  that  stage  is  an  estimated  location  in  terms  of  S.  The  steps 
performed  at  this  stage  are: 

1. Compute sectors, S', that overlap with S

2.  Assign  nodes  associated  with  S'  to  aim  sensors  at  S  and  take  frequency  and 
amplitude  measurements 

3.  Compute  and  output  location  and  velocity  estimates 

One of the agents in the team is then chosen as the mediator and projects that
target forward in time. The projected path is then segmented into a set of tasks,
projected_track = {(L_1, T_1), (L_2, T_2), ...}, where each task is represented as a
pair (L_i, T_i) of location and time, respectively. A candidate team is then chosen
based on its proximity to projected_track. The projected path is then input to
the allocation process (mediation or auction) which computes an initial allocation
and, if there is more time, conducts additional rounds of mediation to improve
that allocation. At that time, background commitments are also identified. Once
it becomes time to act, the allocation or set of commitments is communicated to
the tracker (in our system, some distinguished agent(s) responsible for this task)
which then begins collecting measurements from the allocated team members and
computing a track for the target. Finally, the monitor process compares the
projected and the actual tracks and re-starts the entire process depending on the
result.
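
The projection and segmentation step can be sketched as follows. The constant-velocity motion model, the fixed time step, and the range-based team selection are assumptions of this example; the text says only that the target is projected forward in time, that the path is segmented into (location, time) tasks, and that a candidate team is chosen by proximity.

```python
def project_track(position, velocity, start_time, horizon, step):
    """Constant-velocity projection split into (location, time) tasks."""
    x, y = position
    vx, vy = velocity
    tasks = []
    n_steps = int(horizon / step)
    for k in range(1, n_steps + 1):
        dt = k * step
        tasks.append(((x + vx * dt, y + vy * dt), start_time + dt))
    return tasks


def candidate_team(sensors, track, max_range):
    """Pick sensors within max_range of any projected location."""
    def near(pos, loc):
        return (pos[0] - loc[0]) ** 2 + (pos[1] - loc[1]) ** 2 <= max_range ** 2
    return sorted(s for s, pos in sensors.items()
                  if any(near(pos, loc) for loc, _ in track))


track = project_track((0.0, 0.0), (1.0, 0.5), start_time=100.0, horizon=10.0, step=2.5)
sensors = {"S1": (2.0, 1.0), "S2": (30.0, 30.0), "S3": (9.0, 6.0)}
print(track)
print(candidate_team(sensors, track, max_range=3.0))  # ['S1', 'S3']
```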
Figure 2.17: Architecture


a)  UMass  visualization  b)  Time  line  showing  sensor  activation 
Figure  2.18:  The  UMass  visualization  tool 
2.6.1 Visualization tools and geometric reasoning

One  aspect  of  the  evaluation  that  we  conducted  of  our  system  took  place  during 
development  time.  We  made  extensive  use  of  a  visualization  tool  developed  by 
the  University  of  Massachusetts  (UMASS)  (See  Figure  2.18).  Using  this  tool,  we 
were  able  to  ascertain  whether  the  sequences  of  measurements  taken  by  agents 
were  synchronized;  the  knowledge  gained  often  enabled  us  to  more  easily  locate 
problems  with  the  algorithms  we  had  developed.  For  example,  if  the  visualiza¬ 
tion  tool  detected  that  an  agent  was  taking  less  than  one  measurement  per  second 
during  tracking,  it  was  likely  that  there  was  a  problem  with  the  algorithms  that 
implemented  the  system  architecture. 

To help distinguish problems associated with the tracker from those associated
with the negotiation algorithms, we developed a geometric tracker. This tracker
combines a conservative sensor coverage model with the information it receives
indicating that an object is moving somewhere in a sector. By overlapping
several zones that represent the areas sensed by the sensors (see Figures 2.19
a and b), one can determine the general area in which the target is located.
As more sensors and more measurements are added, the area of uncertainty
becomes smaller. The purpose of the geometric tracker was to determine whether
the tracker’s result was actually plausible. For example, in Figures 2.20 e
and f, the tracker is reporting an incorrect position for the target. Without
the geometric tracker, the agents would have redirected their sectors to the
new position, where there was in fact no target, thus wasting resources.
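
The overlap test described above can be illustrated with a minimal sketch. The
sector model (sensor position, heading, half-angle, range), the grid sampling,
and all names below are assumptions made for illustration; the report does not
specify how the geometric tracker was implemented.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Sector:
    """Hypothetical conservative coverage zone of one sensor sector."""
    x: float           # sensor position
    y: float
    heading: float     # direction of the sector's centre line, in radians
    half_angle: float  # half of the sector's angular width, in radians
    max_range: float   # conservative detection range

    def contains(self, px: float, py: float) -> bool:
        dx, dy = px - self.x, py - self.y
        dist = math.hypot(dx, dy)
        if dist > self.max_range:
            return False
        if dist == 0.0:
            return True
        # Smallest signed angle between the bearing to the point and the heading.
        diff = math.atan2(dy, dx) - self.heading
        diff = math.atan2(math.sin(diff), math.cos(diff))
        return abs(diff) <= self.half_angle


def uncertainty_region(detecting, extent, step=0.5):
    """Grid points that lie inside every sector that reported movement."""
    xmin, ymin, xmax, ymax = extent
    nx = int(round((xmax - xmin) / step))
    ny = int(round((ymax - ymin) / step))
    points = []
    for i in range(nx + 1):
        for j in range(ny + 1):
            px, py = xmin + i * step, ymin + j * step
            if all(s.contains(px, py) for s in detecting):
                points.append((px, py))
    return points


def is_plausible(estimate, detecting):
    """A tracker estimate is plausible only if every detecting sector covers it."""
    return all(s.contains(estimate[0], estimate[1]) for s in detecting)
```

Adding a second detecting sector can only shrink the region returned by
`uncertainty_region`, which mirrors the observation that more sensors and
measurements reduce the area of uncertainty.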

This tracker was effective for two reasons. First, the sensors did an excellent
job of detecting movement; the number of false positives (i.e., the number of
movements detected when there were actually none) was almost zero. Second, the
sensors were placed intelligently to maximize coverage, making the overlapping
area small.

In  Figure  2.19-d,  the  four  vertically  aligned  small  dots  within  the  box  are 
the segments that were actually auctioned off during the next ten-second period.
Those  segments  were  based  on  the  projection  (darker  line  in  Figure  2.19-d  just 
above  those  dots)  of  the  target  direction. 

As an example of the sorts of background commitments that were enforced by the
system, Figure 2.21 shows a screen capture where node 5 is not part of the team
of nodes involved in the current tracking activity; however, node 5 happens to
“illuminate” the sector where the target is located as part of its normal
scanning activity. Rather than being discarded, the resulting measurement is
integrated opportunistically to improve the target estimate. In this way,
agents would not discard data that could prove useful to the team, even if they
were not explicitly asked to collect such data. In some cases, we obtained up
to 13% more data points for a track as a result of this addition.

Figure 2.19: The geometry tracker in action.

Figure 2.20: Identifying bad tracker estimates.

2.6.2  Experimental  results 

Data were collected from a set of 18 experiments, where each experiment
represented an average of three trials, for a total of 54 runs of Radsim using
different settings. Each run lasted 240 simulated seconds. The purpose of the
experiments was to gather data on the performance of the system as a function
of parameters such as projection length, number of targets and communication
overhead.

2.6.3  Auction  results 

The first set of experiments involved sequential auctioning in a 16-node sensor
arrangement, tracking varying numbers of targets traveling at 0.5 feet/second.
The values in each column of the table in Figure 2.22 represent the average
root-mean-square (RMS) error for the given number of targets, followed by the
average number of messages received by each node for the given number of
targets.
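
As a point of reference, an RMS track error can be computed as in the following
sketch. It assumes that the error is the Euclidean distance between
time-aligned estimated and actual target positions; the report does not state
the exact definition used, and the function name is hypothetical.

```python
import math


def rms_error(estimated, actual):
    """Root-mean-square distance between time-aligned (x, y) positions."""
    if len(estimated) != len(actual) or not estimated:
        raise ValueError("tracks must be non-empty and of equal length")
    squared = [
        (ex - ax) ** 2 + (ey - ay) ** 2
        for (ex, ey), (ax, ay) in zip(estimated, actual)
    ]
    return math.sqrt(sum(squared) / len(squared))
```

Under this assumption, the value reported for an experiment would be the mean
of `rms_error` over its three trials.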

Figure 2.21: Opportunism

Figure 2.22: Auctioning, 16 nodes, 0.5 feet per second, 240-second run.
(Surviving table entries: 192.26; 15 sec.; 4.22; 7.62; 34.81.)


Each row represents a variation in the length of the track projection governed
by each auction. The “standard” length during design of the system was 10
seconds, meaning that, before auctioning tasks, the auctioneer projected the
area that the target would cover during the following ten seconds. In addition
to the 5- and 15-second entries, there are also entries corresponding to
experiments involving projections of length 0, which constitute a purely
reactive auctioneer. That is, such an auctioneer would continuously instruct
the other nodes to immediately fire at the current point.
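
The role of the projection length can be sketched as follows. This is an
illustration only: it assumes a constant-velocity projection sampled once per
second, and the function name and parameters are hypothetical rather than taken
from the system.

```python
def project_track(position, velocity, horizon_s, step_s=1.0):
    """Predict (t, x, y) points from now until now + horizon_s.

    A horizon of 0 yields only the current point, which corresponds to the
    purely reactive auctioneer.
    """
    x0, y0 = position
    vx, vy = velocity
    if horizon_s <= 0:
        return [(0.0, x0, y0)]
    steps = int(horizon_s // step_s)
    return [
        (k * step_s, x0 + vx * k * step_s, y0 + vy * k * step_s)
        for k in range(steps + 1)
    ]
```

For a target moving at 0.5 feet/second, a 10-second projection spans 5 feet of
travel; the segments covering those predicted points are the ones that would be
auctioned.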

The  second  set  of  experiments  utilized  the  same  target  speed  of  0.5  feet/second, 
and  focused  solely  on  a  10-second  projection  length.  The  system  consisted  of  a 
24-node configuration. The results are shown in Figure 2.23.

Figure 2.23: Auctioning, 24 nodes, 0.5 feet per second, 240-second run.
(Surviving table entries: 1 target; 2 targets; 1.24; 3.12; 24.58.)

RMS by Projection Length. Figure 2.24 illustrates a fairly uniform degradation
in track quality as the number of targets increased, regardless of the chosen
projection length (using sequential auctions). That is, RMS error increased as
the nodes had to track more targets, making the largest jump with the addition
of a third target. Specific values are shown in Figures 2.26 through 2.29.

Figure 2.24 indicates that a 10-second projection was ideal for the given
auctioning system, always resulting in less error than other settings. As the
projection became shorter, to 5 or 0 seconds, slightly more error arose. We
found the largest error at a 15-second projection, in all but the three-target
case.

When the projection area was too small, agents became overly reactive; we
found that there was not enough latency, and agents were slightly more likely
to turn away from the correct firing sector whenever there was a momentary
fluctuation in the track synthesizer. However, for the same reason, they were
also able to recover fairly quickly from a momentary error, as they did not
commit to such a reading for as long a period of time.

We  found  that  a  long  projection  length  of  15  seconds  was  better  at  insulating 
the  system  from  fluctuations;  however,  when  the  system  received  a  bad  reading 
from  the  tracker,  it  was  more  likely  to  incorrectly  maintain  its  focus  on  the  wrong 
sector.  Specifically,  unless  the  target  later  deviated  by  such  a  margin  that  the 
auction  correction  mechanism  would  override  the  current  assignment,  the  nodes 
would  fire  upon  the  incorrect  position  for  a  full  15  seconds.  This  resulted  in 
longer  stretches  of  lost  tracks  and  hence  a  sharp  increase  in  error  in  all  cases  but 
the  three-target  one,  which  was  so  difficult  to  track  that  RMS  was  less  statistically 
significant. 

Messages by Projection Length. Figure 2.25 illustrates another consequence of
varying projection length: conservation of messages. Here the average number
of messages received by a node is compared for each of the four experimental
projection settings, during tracking of a single target.

When a projection covers a longer span of time, auctions are less frequent,
resulting in fewer messages within the same 240-second trial. The reason the
curve appears more logarithmic than linear is that reducing the frequency of
auctions additionally mitigates the need for re-sends and overrides, which
resulted in disproportionately higher message traffic.

The  given  measurements  do  not  include  messages  containing  measurement 
reports  to  the  track  synthesizer,  which  varied  arbitrarily  across  projection  length 
settings. 

Track Quality. Figures 2.26 through 2.29 report results on track quality for
0-, 5-, 10-, and 15-second projections. Figure 2.26 shows the performance of
the 0-second projection, giving specific values for RMS. Track quality degraded
with the addition of targets, especially with the addition of the third target.
The degradation resulted from a disproportionate increase in
target-differentiation problems, which produced very poor RMS scores.

In  particular,  the  introduction  of  a  third  target  makes  it  especially  likely  that 
two  targets  will  pass  through  the  same  sector  for  a  given  node,  resulting  in  a 
composite  reading  that  is  not  well-handled  by  the  tracker.  In  addition,  when  the 
two  targets  exit  the  area  and  then  diverge,  there  is  no  way  to  match  the  subsequent 
tracks  with  the  tracks  for  the  targets  as  they  entered  the  area.  Such  a  problem  exists 
even  if  the  paths  do  not  include  sharp  turns,  so  long  as  the  point  of  intersection 
does  not  fall  right  on  a  sector  boundary. 

Track Quality by Number of Nodes. Figure 2.30 compares track quality for
various numbers of targets when the number of nodes was increased; these
experiments were limited to 10-second projections. In particular, the same
target paths were tracked by 24 nodes rather than 16 nodes, resulting in
comparable track quality.

This  was  not  surprising  in  the  case  of  one  or  two  targets,  as  they  were  already 
well-covered  by  16  nodes.  Significantly,  though,  there  was  no  improvement  in 
any  of  the  trials  for  three  targets,  where  the  16  nodes  often  were  not  able  to  de¬ 
vote  more  than  three  sectors  to  a  given  target.  Once  again  this  was  attributed  to 
problems  with  target  differentiation. 

2.6.4  Mediation  experiments 

The  purpose  of  the  second  set  of  experiments  was  to  evaluate  the  effect  of  the 
choice  of  group  decision-making  procedure  on  the  track  quality  achieved  by  a 
group  of  sensors.  We  compared  auctions  to  mediation  in  an  environment  consist¬ 
ing  of  16  nodes  using  a  10-second  projection  length.  As  shown  in  Figure  2.31, 
the RMS error achieved by auctions was similar to that achieved by mediation for
both  the  one-  and  two-target  environments. 

The main difference observed between the two approaches involved the number of
messages passed. The auction used fewer inter-agent communication messages to
achieve its task allocation. In the system, it was the number of messages
passed, rather than the length of those messages, that most affected
performance. Due to the higher number of messages required by mediation, and
the large overhead associated with message passing in the system, mediation was
unable to find feasible allocations when targets traveled at the usual 0.5 feet
per second. As a result, the experiments for mediation were conducted with a
target speed of 0.1 feet per second.

As illustrated in Figure 2.32, the number of messages required for auctions
grows with the number of targets (that is, the number of group decisions that
must be made). The figure also shows that mediation is able to find an
allocation that is as effective as that found by auctions with a constant
number of messages. In this domain, auctions are able to find the best
allocation with a small number of messages because of both the small size of
the search space and the small size of the information set that must be
communicated to the auctioneer: there is only a small degree of interaction
between tasks, and the suitability of tasks to agents is based on distance from
the target, which is easy to communicate. The advantage of mediation lies in
its simplicity and scalability: the mediator needs no prior information about
the domain, and the agents respond to fully specified proposed outcomes. As a
result, mediation would extend well to problems with a larger number of targets
in this domain. Mediation could allocate tasks in a constant number of
messages, while with auctions the number of messages required would increase.
Practical limitations in our tracking technology prohibit us from exploring
results of experiments with more than two targets.

The  last  graph  compares  the  number  of  messages  per  node  utilized  by  the 
two  approaches.  While  the  auctioning  system  must  send  more  messages  for  more 
auctions  with  the  addition  of  each  target,  the  mediation  system  importantly  uses  a 
virtually  constant  amount  of  communication. 

While  auctioning  better  conserves  communicative  resources  for  one  or  two 
targets,  after  that  it  will  far  surpass  mediation  in  generating  messages.  Message 
count  is  almost  constant  for  mediation  because  each  allocation  message  governs 
all  targets  at  once. 

Figure 2.30: Track quality by number of nodes.

Figure 2.31: Auctioning versus mediation: quality.

Figure 2.32: Auctioning versus mediation: messages.

2.7  Summary  and  related  work 

Research  in  auction  algorithms  has  generally  assumed  that  agents  are  self-centered. 
Such  an  assumption  is  appropriate  for  allocation  of  items  such  as  tobacco  or  flow¬ 
ers,  where  an  agent’s  well-being  is  only  affected  by  those  items  that  it  is  or  is  not 
allocated.  However,  as  we  have  discussed,  in  multi-agent  task  allocation  situations 
one  agent  may  feel  strongly  about  task  allocations  to  other  agents  in  the  group. 

Historically,  task  interaction  has  been  cited  as  an  additional  problem  with 
using  auctions  for  task  allocation.  Combinatorial  auctions  [[Sandholm  1999a, 
Hunsberger  and  Grosz  2000,  inter  alia]]  provide  a  solution  to  this  problem  by  al¬ 
lowing  agents  to  place  “exclusive  or”  bids  on  bundles  of  tasks:  an  agent  can  bid 
accurately  on  sets  of  tasks  that  exhibit  positive  or  negative  interaction.  In  addition, 
each  agent  may  have  relevant  information  as  to  which  other  agents  are  best  suited 
to  perform  which  tasks.  In  auctions,  an  agent  is  assumed  to  be  concerned  with 
only  the  items  it  expects  to  win. 

Still, there are challenges in applying combinatorial auctions to problems in
which the amount of time available for negotiation is not known in advance.
Traditional combinatorial auctions require a two-step process in which agents
first generate and submit bids, and a winner determination algorithm is then
applied to determine the best outcome. While the winner determination phase may
be anytime, the bid generation and group context problems that we have
discussed still remain. A related problem with combinatorial auctions is
completeness: all tasks need to be allocated. A complete allocation may not be
possible when each agent is generating bids independently.

Iterative combinatorial auctions [[Parkes 2001]] have been proposed as an
alternative that simplifies the bid generation phase by allowing agents to
reveal their bids incrementally. Agents with complex utility functions, in
environments where communicating those functions is costly, may benefit
substantially from such a mechanism.

Task re-allocation has been addressed by the literature on contract net
protocols [[Smith and Davis 1983, Sandholm 1998]]; in mediation, task
re-allocations are considered during each new cycle. Sandholm [[Sandholm 1998]]
proves that the optimal allocation may only be found by allowing arbitrarily
complex contracts (of type OCSM) to be made; these contracts can be so complex
that they may involve all the agents in the group, thereby eliminating the
advantage of a decentralized approach. The Allocation Improvement algorithm
described in Figure 2.5 extends this work to provide a concrete implementation
of an algorithm for ordering successive OCSM contract exchanges. In addition,
the work described in [[Sandholm 1998]] does not address the problem of dynamic
response to changes in the environment.

The  major  difference  in  the  assumptions  made  in  work  on  contract  nets  and  in 
the  work  reported  in  this  chapter  is  in  the  incentives  of  agents.  Sandholm’s  work 
assumes  that  an  agent’s  decision  making  is  based  on  myopic  best  response  and 
requires  inter-agent  monetary  payments  to  induce  task  reallocation.  The  agents 
described  in  this  chapter  are  assumed  to  be  interested  in  maximizing  the  objective 
function,  and  as  a  result  do  not  require  monetary  payments  as  an  incentive  to 
exchange  tasks. 

Game theoretic models involve analysis and development of systems for
multi-agent interaction. That work focuses on ways that agents in a group make
decisions when the results of those decisions co-depend on the decisions of
others in the group. Mechanism design [[Parkes 2001, inter alia]] involves
developing algorithms aimed at ensuring agents truthfully reveal relevant
information about their preferences. Under our assumption that the goal of the
agents is to maximize the objective function, agents will not be willing to
incur a cost to ensure truthful revelation.

Work in contract nets does not fall squarely into the area of game theory
because agents are assumed to exhibit myopic best response strategies rather
than omniscient self-interest. The assumption made in this chapter of an
incentive to maximize the group objective function is yet another approach. The
decision of which approach is more appropriate depends on the domain and on the
qualities of the agents that are negotiating. Study of agents with bounded
rationality [[Sandholm 1999b, inter alia]] promises to provide insight into
making this decision.

Research in Constraint Satisfaction Problems (CSPs) has addressed issues
related to those we have discussed with respect to dynamic mediation (see, for
example, Chapter 10 of [Lesser et al 2003]). These include adapting a CSP
solution to changes in the environment when those changes are expressed as new
constraints [[Dechter 1988, Verfaillie and Schiex 1994]]. Algorithms for
Distributed CSP (DCSP) [[Yokoo and Ishida 1999]] distribute relevant constraint
information among several agents. Heuristic CSP repair methods have been
explored in the context of dynamic rescheduling [[Minton et al 1992]]. One
important difference with dynamic mediation is that in CSPs any variable
assignment that satisfies the problem’s constraints is a solution; there is no
welfare function to be optimized.* In addition, DCSPs only support
communication of negative interactions in terms of nogoods.

The MARS system [[Fischer et al 1995]] addresses dynamic multi-agent
renegotiation, focusing on replanning within a single agent and requiring
communication with other agents only when a new single-agent plan cannot be
found. Dynamic mediation supports negotiation among all agents.

The focused D* algorithm is a real-time replanning algorithm that has been
applied to robot path planning in partially known environments [[Stentz 1995]];
the arc costs in the search graph can change during the search process. It has
been suggested that negotiation can be viewed as a form of distributed search
[[Durfee 1999, Ephrati and Rosenschein 1993]]; it would be interesting to cast
negotiation first as a distributed search problem and then apply an algorithm
such as focused D*. Research in self-stabilizing algorithms focuses on
algorithms that adapt to transitions to any arbitrary state [[Dolev 2000]];
instead, we have focused on particular types of faults.

* However, see the work described in Chapters 10, 11 and 13 of [Lesser et al 2003].

In summary, we have explored the class of center-based algorithms and, in
particular, introduced a new algorithm, called Dynamic Mediation, through which
teams of cooperative agents can negotiate over the distribution of tasks while
allowing solutions to be adapted to a dynamically changing environment in which
new tasks and agents can appear or faults can occur. The algorithm makes use of
prior progress without requiring that the entire negotiation be restarted. In
Dynamic Mediation an agent’s bid need not be restricted to a single value for a
task but can include additional useful information in the form of potential
positive and negative interactions with other tasks. Rather than communicate
such interaction information in terms of fully specified and executable tasks,
agents maintain task type information, which is more general and can be used by
the mediator to prune the negotiation space. We have also presented a new
algorithm, together with promising results in a synthetic domain, that
addresses the combinatorial bid generation problem. Finally, we have presented
a system architecture that addresses execution-time issues involving
monitoring, re-negotiating and establishing background team commitments.

We have described experimental results in various domains including the
distributed sensor challenge problem. The experimental results suggest that
dynamic negotiation methods have significant promise. Based on our experiments,
we conclude that it is best to use dynamic mediation when time is an important
resource and negotiations must end quickly. If negotiation time were unlimited,
the quality of outcomes attained by static methods would eventually catch up to
that attained by dynamic methods. We are currently experimenting with
extensions that relax the restriction that proposals contain no inter-agent
dependencies and that apply a general task abstraction hierarchy to the bid
generation problem.

Chapter  3

Resource allocation in very large-scale environments

3.1  The  Distributed  Dispatcher  Manager 

In this chapter we consider the complexities that arise when one scales up
distributed agent networks to thousands of sensor agents and thousands of
objects. We describe a system for effectively managing such networks, called
the Distributed Dispatcher Manager (DDM). DDM differs in a number of important
ways from other systems.

Mobility:  We  focus  on  mobile  sensor  agents;  agents  remain  relatively  simple  and 
lightweight. 

Organization:  The  complexity  of  the  distributed  control  problem  for  such  mas¬ 
sive  agent  systems  is  managed  through  a  hierarchical  organization  in  which 
teams  of  agents  are  associated  with  sectors;  teams  themselves  can  represent 
elements  of  other  teams. 

Tracking:  We  have  extended  the  tracking  algorithms  discussed  so  far  in  this  book 
so  that  a  single  agent  can  track  an  object  by  taking  multiple  sequential  mea¬ 
surements  and  combining  them.  We  assume  that  multiple  objects  can  be 
discriminated  within  the  field  of  a  sensor.  Finally,  we  do  not  focus  on  the 
tracking  of  a  particular  object,  but  rather  on  adequate  coverage  of  given 
areas. 

Task synchronization: One consequence of the above extensions to the tracking
algorithm is that the communication requirements between agents are lessened
and, in particular, synchronization between agents is not necessary.

Sensors:  DDM,  in  its  current  state,  does  not  manage  the  usage  of  certain  re¬ 
sources,  such  as  sensor  power  utilization.  In  addition,  our  treatment  of  sen¬ 
sor  noise  is  a  bit  different  from  the  systems  described  so  far  in  that,  in  DDM, 
some  measurements  are  lost,  but  the  ones  that  are  not  lost  are  accurate. 

Simulation: In order to experiment with DDM in domains of the above sort, we
have developed a simulation that reflects the above assumptions. In some
cases, the set of environments constructed for testing the system has been
complicated to reflect such complexities; in other cases, a number of
simplifying assumptions have been made so as to be able to focus on the
scale-up issues (for example, we focus on objects that move in straight
lines).

DDM  organizes  the  Dopplers  in  teams,  each  with  a  distinguished  team  leader. 
A  team  is  assigned  a  specific  geographic  sector  of  interest.  Each  Doppler  can 
act  autonomously  within  its  assigned  area  while  processing  local  data.  Teams  are 
themselves  grouped  into  larger  teams.  Communication  is  restricted  to  flow  only 
between  an  agent  (or  team)  and  its  team  leader.  Each  team  leader  is  provided  with 
an  algorithm  to  integrate  information  obtained  from  its  team  members. 

We  present  results  from  experiments  that  involved  hundreds  of  agents  and 
more  than  a  thousand  objects.  These  results  support  our  hypothesis  that  DDM  is 
successful  in  large-scale  environments.  The  experiments  also  allow  us  to  examine 
the  question  of  how  to  determine  the  number  of  levels  of  the  DDM  hierarchy 
in  a  large-scale  system.  Our  results  show  that  as  the  number  of  levels  of  the 
hierarchy  increases  the  quality  of  the  results  slightly  decreases.  However,  the  time 
complexity  of  the  system  decreases  exponentially.  Consequently,  we  found  that 
using  too  few  levels  may  not  suffice  to  solve  the  global  problem. 

In  the  next  section  we  describe  the  large  scale  ANTS  problem  and  present 
the  main  elements  of  DDM.  Subsequently,  we  detail  the  comprehensive  study  we 
conducted  to  evaluate  the  hierarchical  solution.  We  conclude  by  discussing  the 
major  contribution  of  our  solution  to  the  large-scale  agent  system  challenges  in 
terms  of  capability,  accuracy,  efficiency,  cost-effectiveness,  robustness  and  fault 
tolerance. 




3.2  The large scale ANTS challenge problem and the DDM

We  consider  a  large-scale  environment  where  there  are  many  mobile  targets  and 
many  mobile  Dopplers  moving  in  a  specified  geographic  area.  The  goal  of  the 
DDM  system  is  to  track  the  targets.  Each  target  moves  at  a  steady  velocity  along 
a straight line. Targets differ from each other by their motion properties. Motion
properties define the target state (location and velocity) at any given time. Both
location  and  velocity  are  vectors.  The  location  vector  is  referred  to  in  the  physics 
literature  as  the  radius  vector,  the  vector  from  the  axis  origin  (0,0)  to  a  target.  The 
velocity  vector  describes  the  change  in  a  target’s  location  every  second.  A  steady 
motion  equation  may  look  like  the  following  [[Feynman  1963]]: 

$$\vec{r}(t) = (\vec{r}_0 + \vec{v} \cdot t,\ \vec{v})$$

where $\vec{r}_0$ is the location of the target at time $t = 0$ and $\vec{v}$ is the velocity vector.
The  goal  of  the  DDM  system  is  to  identify  the  motion  equation  of  each  target  in 
the  area. 
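
The steady motion equation can be written directly as code. The sketch below is
illustrative only; the class and function names are hypothetical, and a
two-dimensional area is assumed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetState:
    """Target state at one instant: location (x, y) and velocity (vx, vy)."""
    x: float
    y: float
    vx: float
    vy: float


def state_at(r0, v, t):
    """State at time t of a target that was at r0 at t = 0 with constant velocity v."""
    return TargetState(r0[0] + v[0] * t, r0[1] + v[1] * t, v[0], v[1])
```

Identifying a target's motion equation therefore amounts to recovering the pair
(`r0`, `v`), since every later state follows from it.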

DDM  uses  agents  to  find  the  set  of  motion  equations  that  represent  the  tar¬ 
gets.  We  will  refer  to  this  set  as  the  global  information  map,  denoted  by  InfoMap. 
The  base  level  of  the  DDM  consists  of  mobile  sampling  agents.  Mobile  sampling 
agents  are  agents  that  use  simple  Doppler  sensors  to  sense  targets.  Each  of  these 
sampling  agents  moves  autonomously  according  to  a  predefined  movement  algo¬ 
rithm.  Each  agent  periodically  stops  briefly  to  take  measurements.  When  an  agent 
takes  measurements  we  refer  to  its  state  as  the  viewpoint  from  which  a  particular 
object  state  was  measured.  The  measurements  alone  are  not  sufficient  to  identify 
the  exact  state  of  the  observed  target.  The  sampler  agent  can  only  determine  two 
possible  states  of  the  observed  target  based  on  four  consecutive  measurements. 
Only  one  of  them  is  the  correct  state  of  the  target  at  the  observation  time.  Using 
such  estimates,  a  sampler  agent  then  produces  what  we  call  a  capsule.  A  capsule 
consists  of  (a)  two  possible  states  of  a  target,  (b)  the  time  associated  with  the  mea¬ 
surements  that  were  used  to  infer  the  possible  target  states,  and  (c)  the  state  of  the 
agent, i.e., its location and orientation during the measurements. Later on we will
show how sampling agents transform measurements into capsules.
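
The three components of a capsule map naturally onto a small record type. The
following is a sketch under assumed field names; the report does not describe
how capsules were represented in the DDM implementation.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TargetState:
    x: float
    y: float
    vx: float
    vy: float


@dataclass(frozen=True)
class AgentState:
    """Viewpoint: the sampling agent's location and orientation while measuring."""
    x: float
    y: float
    orientation: float


@dataclass(frozen=True)
class Capsule:
    candidates: Tuple[TargetState, TargetState]  # (a) two possible target states
    time: float                                  # (b) time of the measurements
    viewpoint: AgentState                        # (c) state of the agent

    def __post_init__(self):
        if len(self.candidates) != 2:
            raise ValueError("a capsule holds exactly two candidate states")
```

Only one of the two candidates is the true target state; a capsule alone does
not say which.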

The DDM’s goal is to estimate the motion equations of the targets using
capsules. The equations can be deduced from a sequence of target states. The
main problem faced by the DDM with respect to a capsule is how to choose the
correct state. To resolve this problem, the DDM makes use of the fact that two
states may be the correct states of the same target if they are instances of
the same motion equation. We introduce a relation called ResBy that holds if
the state of one target comes about from another. DDM tries to match states of
capsules using this relation and makes linked lists of target states. Each
linked list represents a potential target movement. We refer to this linked
list of target states as a path. DDM also attempts to determine the accuracy of
potential paths. It assumes that if two or more target states in a path were
recorded by different agents, then the path represents target motion
accurately.
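
A minimal sketch of this matching step follows. It assumes that ResBy holds
when the later state equals the earlier state advanced along its own constant
velocity, within a tolerance; this reading, the greedy linking strategy, and
all names are assumptions for illustration and not the DDM's actual definitions.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TimedState:
    t: float
    x: float
    y: float
    vx: float
    vy: float
    agent: str  # identifier of the sampling agent that produced the state


def res_by(later, earlier, tol=1e-6):
    """True if `later` results from `earlier` under steady motion (assumed reading)."""
    dt = later.t - earlier.t
    if dt <= 0:
        return False
    return (
        math.isclose(later.x, earlier.x + earlier.vx * dt, abs_tol=tol)
        and math.isclose(later.y, earlier.y + earlier.vy * dt, abs_tol=tol)
        and math.isclose(later.vx, earlier.vx, abs_tol=tol)
        and math.isclose(later.vy, earlier.vy, abs_tol=tol)
    )


def link_paths(capsules, tol=1e-6):
    """Greedily link candidate states into paths.

    `capsules` is a time-ordered list of candidate pairs; a state that extends
    no existing path starts a new one.
    """
    paths = []
    for candidates in capsules:
        for state in candidates:
            for path in paths:
                if res_by(state, path[-1], tol):
                    path.append(state)
                    break
            else:
                paths.append([state])
    return paths


def is_accurate(path):
    """Treat a path as accurate if its states come from at least two agents."""
    return len({s.agent for s in path}) >= 2
```

In this sketch the incorrect candidate of each capsule typically remains in a
path of length one, while correct candidates from different capsules chain
together.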

In a large-scale environment, DDM would have to link many capsules from the
entire area of interest. Applying the relation ResBy many times is time
consuming. However, there is a low probability that capsules created based on
measurements taken far away from one another will fit. Therefore, the solution
is distributed. The DDM uses hierarchical structures to construct a global
InfoMap distributively. The lowest level of the hierarchy consists of the
sampling agents. These agents are grouped according to their location. Each
group has a leader. The sampling agents create capsules and send them to their
group leaders. The second level of the hierarchy consists of the sampler group
leaders. Each sampler group leader obtains capsules only from the sampling
agents in its group. This limits the time that it needs to process the
capsules, but may reduce its ability to link between states since it obtains
only a portion of the capsules.

The sampler leaders are also grouped according to their areas of responsibility.
Each such group of sampler leaders is associated with a zone group leader. A
sampler leader sends its zone leader its estimates of target motion equations in its
area and capsules that it was not able to use in the estimation process. The third
and higher levels of the hierarchy consist of zone group leaders, which, in turn, are
also grouped according to their areas of interest. Zone leader agents are responsible
for retrieving and combining information from their group of agents. That
is, they try to estimate the motion equations based on the estimates they received
from agents in their zone. All communication is restricted to exchanges between a
group member and its leader. The information unit sent by leaders to their
higher-level leaders is called a local information map. A local information map, which
we refer to as localInfo, is a triple consisting of: (i) an accurate solution component
consisting of a set of motion equations that with a high probability represent
targets; (ii) a semi-accurate solution component consisting of a set of paths; and
(iii) a set of capsules that were not used for the formation of any motion equation
of (i) or any path of (ii). That is, each zone leader obtains local information maps
of all its agents and combines them into an information map of its area. Thus, the
top-level leader agent forms a local information map of the entire area.

To conclude, the formation of a global information map integrates the following
processes:

•  Each  sampling  agent  gathers  raw  sensed  data  and  generates  capsules. 

• Every dT seconds each sampler group leader obtains from all its sampling
agents their capsules and integrates them into its localInfo.

• Every dT seconds each zone group leader obtains from all its subordinate
group leaders their localInfo and integrates them into its own localInfo.

• As a result, the top-level group leader localInfo contains a global information
map.
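The processes listed above amount to a bottom-up pass over the hierarchy. The following Python sketch illustrates one such pass under simplifying assumptions: the class names `Sampler`, `Leader` and `LocalInfo`, the `collect` method, and the use of a single synchronous sweep in place of the dT-second timer are all hypothetical, and capsules are treated as opaque values. It is not the report's implementation.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class LocalInfo:
    """Hypothetical container: motion functions, paths, unused capsules."""
    functions: list = field(default_factory=list)
    paths: list = field(default_factory=list)
    unused_capsules: list = field(default_factory=list)


@dataclass
class Sampler:
    """Leaf of the hierarchy: holds capsules until its leader asks for them."""
    pending: list = field(default_factory=list)

    def collect(self) -> list:
        # A sampler hands over its capsules and then deletes them.
        capsules, self.pending = self.pending, []
        return capsules


@dataclass
class Leader:
    """Sampler group leader or zone group leader, depending on its children."""
    children: List[Union["Leader", Sampler]] = field(default_factory=list)
    info: LocalInfo = field(default_factory=LocalInfo)

    def collect(self) -> LocalInfo:
        # One tick only: pull from every subordinate, then integrate.
        # (Path finding and function merging are omitted in this sketch.)
        for child in self.children:
            got = child.collect()
            if isinstance(got, LocalInfo):   # subordinate is a leader
                self.info.functions.extend(got.functions)
                self.info.paths.extend(got.paths)
                self.info.unused_capsules.extend(got.unused_capsules)
            else:                            # subordinate is a sampler
                self.info.unused_capsules.extend(got)
        return self.info
```

Calling `collect()` on the top-level leader leaves that leader holding every capsule produced below it, which mirrors the statement that the top-level localInfo contains the global information map.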

We  have  developed  several  algorithms  to  implement  each  process.  In  the  next 
section  we  present  those  algorithms. 


3.3  Descriptions  of  algorithms 

The  first  algorithm  describes  the  method  for  constructing  a  capsule  from  raw 
sensed data. This algorithm is activated by each sampler agent and uses consecutive
raw sensed data. The second and the third algorithms describe the way
in  which  every  group  leader  processes  incoming  local  information  maps  of  the 
sub-areas  of  its  zone  to  generate  a  more  comprehensive  local  information  map  of 
its  entire  area. 

First, we will describe the main data structures that the agents use. In these
data structure specifications and in the algorithm descriptions, we will use a dot
notation to describe a field in a structure, e.g., c.sa is the sampling agent field of
the capsule c.

Target state: $s = (D, V)$ where $D$ is the location of the target and $V$ is its
velocity at a given time. For example: ((100, 100), (2, -1)) is a target state
where the target was at location $x = 100$ and $y = 100$ and its velocity was
$V_x = 2$ and $V_y = -1$.

Sensing agent state: $sa = (D, O)$ where $D$ is the location of the sensor and $O$ is
the orientation of the sensor. For example: ((150, 150), π/2) is an agent’s
state where the sensing agent was at location $x = 150$ and $y = 150$ and
had an orientation of π/2.

Figure 3.1: DDM hierarchy information flow diagram.

Capsule: $c = (t, sa, \{s_1, s_2\})$ where $t$ is the time of the sampling, $sa$ is the
sensing agent’s state during the time the measurements that were used for the
formation of the capsule’s target states were taken, and $s_1$, $s_2$ are two possible
target states computed by the sampler agent. An example is

(30, ((150, 150), π/2), {((100, 100), (2, -1)), ((200, 200), (-2, -3))})

where the time of the sampling was 30, the sensing agent state was
((150, 150), π/2), and the two possible target states were
((100, 100), (2, -1)) and ((200, 200), (-2, -3)).

Path point: $\pi_i = (t_i, sa_i, s_i)$ where $t_i$ is the time of the point, $sa_i$ is the sensing
agent’s state during the time the measurements that were used to compute
the point’s state were taken, and $s_i$ is the target state of the point. For
example: (30, ((150, 150), π/2), ((100, 100), (2, -1))), where the time
of the path point was 30, the sensing agent state was ((150, 150), π/2) and
the target state was ((100, 100), (2, -1)).

That  is,  while  in  a  capsule  there  are  two  possible  states  associated  with 
measurements,  in  a  path  point  there  is  only  one.  The  goal  of  the  agents  is  to 
choose  the  correct  one. 

Path: $p = (\pi_1, \ldots, \pi_n)$ where $\pi_1$ and $\pi_n$ are the first and the last path points. Every
pair of path points in a path satisfies the ResBy relation.

Target state function: $f_{\pi_s,\pi_e}(t) = (\pi_s.s_s.D + \pi_s.s_s.V \cdot (t - \pi_s.t_s),\ \pi_s.s_s.V)$, valid
in the range $\pi_s.t_s \ldots \pi_e.t_e$. For example: if

$\pi_s = (30, ((150, 150), \pi/2), ((100, 100), (2, -1)))$ and

$\pi_e = (40, ((450, 25), \pi/4), ((120, 90), (2, -1)))$ then

$f_{\pi_s,\pi_e}(t) = ((100, 100) + (2, -1) \cdot (t - 30),\ (2, -1))$, so at $t = 30$ we have
$f_{\pi_s,\pi_e}(30) = ((100, 100), (2, -1))$ and at $t = 40$ we have
$f_{\pi_s,\pi_e}(40) = ((120, 90), (2, -1))$. For simplicity, we will refer to this function
as $f(t) = (D(t), V)$ and to its properties: $f.D(t)$, $f.V(t)$, $f.t_s$ and $f.t_e$.

Local information map: $((f^1, \ldots, f^k), (p^1, \ldots, p^l), \{c_1, \ldots, c_m\})$. To form a local
information map out of raw sensed data, agents should follow a set of
steps corresponding to the following stages of data evolution: (i) measurements,
(ii) capsules, (iii) path, (iv) target state function and (v) local information
map.
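The data structures above map naturally onto small record types. The sketch below is one possible Python rendering, not the report's code: the class names and the choice of frozen dataclasses are assumptions made for illustration. The final lines reproduce the target state function example given in the text.

```python
import math
from dataclasses import dataclass
from typing import Tuple

Vec = Tuple[float, float]


@dataclass(frozen=True)
class TargetState:
    D: Vec  # location
    V: Vec  # velocity


@dataclass(frozen=True)
class AgentState:
    D: Vec    # sensor location
    O: float  # sensor orientation (radians)


@dataclass(frozen=True)
class Capsule:
    t: float
    sa: AgentState
    states: Tuple[TargetState, TargetState]  # two candidate target states


@dataclass(frozen=True)
class PathPoint:
    t: float
    sa: AgentState
    s: TargetState  # the single state chosen from a capsule


@dataclass(frozen=True)
class StateFunction:
    start: PathPoint  # pi_s
    end: PathPoint    # pi_e

    def __call__(self, t: float) -> TargetState:
        """Constant-velocity extrapolation from the first path point."""
        if not (self.start.t <= t <= self.end.t):
            raise ValueError("t outside the function's validity range")
        s, dt = self.start.s, t - self.start.t
        return TargetState((s.D[0] + s.V[0] * dt, s.D[1] + s.V[1] * dt), s.V)


# The example from the text: pi_s at t = 30, pi_e at t = 40.
pi_s = PathPoint(30, AgentState((150, 150), math.pi / 2), TargetState((100, 100), (2, -1)))
pi_e = PathPoint(40, AgentState((450, 25), math.pi / 4), TargetState((120, 90), (2, -1)))
f = StateFunction(pi_s, pi_e)
assert f(40) == pi_e.s  # ((120, 90), (2, -1)), as in the text
```

A capsule keeps both candidate states, while a path point commits to one of them; that is the only structural difference between the two records.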
3.3.1 The raw data transformation and capsule generation algorithm


The process proceeds as follows. A sampling agent deduces a set of possible target
states at a given time to form a capsule. A sampling agent accomplishes this
by first taking four consecutive measurements and then creating a new capsule, c,
such that the time associated with that capsule corresponds to the time of the last
measurement. The state of the sampling agent while taking the measurements is
given by c.sa. The target states resulting from the application of the function
rawDataTransformation to the four consecutive measurements are assigned to c.states.
The agent stores the capsules until it is time to send them to its group leader.
After delivering the capsules to the group leader the sampler agent deletes them. We
will now show how an agent transforms four consecutive measurements into
a capsule.

A measurement is a pair of amplitude, $\eta$, and radial velocity, $V_r$, values for each
sensed target. A radial velocity is the velocity of a target towards the measuring
Doppler. Given a measurement from a Doppler radar the target location can be
computed using the following equation:

$$\eta_i = \frac{k \cdot e^{-(\theta_i - \beta)^2 / \sigma}}{R_i^2} \qquad (3.1)$$

where, for each sensed target, $i$, $R_i$ is the distance between the sensor and $i$; $\theta_i$
is the angle between the sensor and $i$; $\eta_i$ is the measured amplitude of $i$; $\beta$ is the
sensor beam angle; and $k$ and $\sigma$ are characteristics of the sensors and influence
the shape of the sensor detecting area. It is possible to infer the exact location
of a target by intersecting three different measurements taken at the same time
by three different Dopplers. Using the intersection method is very problematic
in large-scale systems as it requires full synchronization and cooperation between
groups of three Dopplers. Thus, the DDM uses measurements from only one
Doppler to deduce a possible target state.

It is known that if the location of an object at time 0 is $D_0$ and its velocity is
$V$ then the next location, $D_1$, at time 1 of the object is given by:

$$D_1 = D_0 + \int_{t_0}^{t_1} V\,dt \qquad (3.2)$$

where $D_t$ is the displacement of the object in time $t$. If we consider the distance
from the center of the Doppler we have that

$$R_1 = R_0 + \int_{t_0}^{t_1} V_r\,dt \qquad (3.3)$$

where $R_t$ is the displacement from the center of the sensor at time $t$ and $V_r$ is
the relative velocity between the Doppler and the target in the direction of the
Doppler’s center. We assume that the acceleration of a target over a short period
of time is zero. The next target location is therefore:

$$R_1 = R_0 + V_r \cdot (t_1 - t_0) \qquad (3.4)$$

We denote $(t_1 - t_0)$ by $t_{1,0}$. From the relation between $R$, $\theta$ and $\eta$, given by
equation 3.1, we can find the next angle as a function of the former.

Figure 3.2: Target sampling by one Doppler.

In Figure 3.2 the dark arrow represents a target movement vector, the small
circles along the target movement represent target locations, $(R_0, \theta_0)$, $(R_1, \theta_1)$ and
$(R_2, \theta_2)$, at time $t_0$, $t_1$, and $t_2$, respectively, as sensed by the Doppler. Following
the projection of $R_1$ and $R_2$ over $R_0$, denoted $R_{1,R_0}$ and $R_{2,R_0}$ respectively, as shown
by the dotted line, we have the following:

$$\frac{R_{1,R_0} - R_0}{R_1 \cdot \sin(\theta_1 - \theta_0)} = \frac{R_{2,R_0} - R_0}{R_2 \cdot \sin(\theta_2 - \theta_0)} \qquad (3.5)$$

Trigonometrically, we may write $R_{1,R_0}$ and $R_{2,R_0}$ as

$$R_{1,R_0} = R_1 \cdot \cos(\theta_1 - \theta_0)$$

and

$$R_{2,R_0} = R_2 \cdot \cos(\theta_2 - \theta_0)$$

By equation 3.1, $\theta_i$ can be written as

$$\theta_i = \beta \pm \sqrt{-\sigma \cdot \ln\left(\frac{\eta_i}{k} \cdot R_i^2\right)} \qquad (3.6)$$

By substituting $R_1$ as given by equation 3.4 into equation 3.6 we can deduce that
the location, $(R_1, \theta_1)$, at $t_1$ of a sensed target may be written as a function of $\theta_0$ as
specified in the next theorem.

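Theorem 3.1 Assuming that the acceleration of a target in a short time period,
$t_{1,0}$, is zero, the next location of the target is then given by

$$\theta_1(\theta_0) = \beta \pm \sqrt{-\sigma \cdot \ln\left(\frac{\eta_1}{k} \cdot (R_0 + V_{r,0} \cdot t_{1,0})^2\right)} \qquad (3.7)$$

while, by equations 3.4 and 3.1,

$$R_1 = R_0 + V_{r,0} \cdot t_{1,0}$$

and

$$R_0 = \sqrt{\frac{k \cdot e^{-(\theta_0 - \beta)^2 / \sigma}}{\eta_0}} \qquad (3.8)$$

where $R_0$, $\theta_0$, $\eta_0$, $V_{r,0}$ represent values of the target at time $t = 0$ and $\theta_1$, $\eta_1$
represent values of the target at time $t = 1$. The same holds for the next angle, $\theta_2$.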

An agent can use the relationship given by equation 3.7 for $\theta_0$, $\theta_1$ and $\theta_2$
together with equation 3.5 to find $\theta_1$ and $\theta_2$ from $\theta_0$. However, the value of $\theta_0$ is not
known and thus cannot be used in equations 3.7 and 3.5. Therefore the algorithm
examines the range of 0, ..., 2π to determine which value of $\theta_0$ solves these two
equations.

Note that, given a specific value of $\theta_0$, the result of equation 3.7 may lead to
two valid solutions. This is the reason for the use of capsules: the sampling agent
will leave the decision of determining the correct target states to the higher levels.
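The two-solution ambiguity can be seen directly by inverting the amplitude model. The sketch below assumes the amplitude relation has the form $\eta = k \cdot e^{-(\theta - \beta)^2/\sigma} / R^2$, which is the form consistent with equation 3.6; the function names and default parameter values are hypothetical and chosen only for illustration.

```python
import math


def amplitude(R, theta, k=1.0, sigma=1.0, beta=0.0):
    """Assumed form of equation 3.1: amplitude at range R and angle theta."""
    return k * math.exp(-((theta - beta) ** 2) / sigma) / R ** 2


def candidate_angles(eta, R, k=1.0, sigma=1.0, beta=0.0):
    """Equation 3.6: the two angles consistent with amplitude eta at range R.

    Returns an empty tuple when no real solution exists.
    """
    x = eta * R ** 2 / k
    if x <= 0 or x > 1:
        return ()
    offset = math.sqrt(-sigma * math.log(x))
    return (beta + offset, beta - offset)


def range_from_angle(eta, theta, k=1.0, sigma=1.0, beta=0.0):
    """The amplitude relation solved for R, given a hypothesised angle."""
    return math.sqrt(k * math.exp(-((theta - beta) ** 2) / sigma) / eta)


# A target at range 70 and angle -0.7 yields two mirror-image candidates.
eta = amplitude(70.0, -0.7)
plus, minus = candidate_angles(eta, 70.0)
```

Because the amplitude depends on $(\theta - \beta)^2$, a single Doppler cannot tell the two sides of the beam axis apart, so both angles are carried forward in the capsule.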

Equation 3.5 cannot be solved symbolically and therefore the sampler agent
uses computational methods. The sampler agent explores the range of $\theta_0$ and
looks for suitable locations corresponding to $\theta_0$. Only certain angles will fit the
above equation. To be more precise, the sampler agent uses one more sample and
applies the same mechanism to $\theta_1$, $\theta_2$ and $\theta_3$. Comparing the results from both
cases improves the accuracy of the results. The calculated angles are used to form
a set of possible pairs of location and velocity of a target (i.e., a capsule). In the
algorithms of figures 3.3, 3.4 and 3.5 we will use the notation sample_i to represent
a measurement $(\eta, V_r)$ at $t = i$.

Find θ0 function

Input: sa, sample0, sample1, sample2
Output: θ0

minimum_diff = ε
min_θ0 = -1

For θ0 = 0 to 2π in δ steps
    calculate θ1 using θ0 by equation 3.7
    calculate θ2 using θ0 by equation 3.7
    diff = the difference between the left side and the right side of
           equation 3.5 using θ1 and θ2
    if (diff < minimum_diff)
        minimum_diff = diff
        min_θ0 = θ0

Return min_θ0

Figure 3.3: Finding a value of θ0.
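The scanning loop of Figure 3.3 can be sketched in Python as follows. This is an illustration only: the residual of equation 3.5 is passed in as a caller-supplied function (a stand-in for deriving θ1 and θ2 from θ0 by equation 3.7 and comparing the two sides of equation 3.5), and the step size `delta` and threshold `epsilon` are hypothetical defaults.

```python
import math


def find_theta0(residual, delta=0.001, epsilon=1e-2):
    """Scan [0, 2*pi] in steps of delta; return the best angle or -1.

    residual(theta0) stands in for |lhs - rhs| of equation 3.5 after
    theta1 and theta2 have been derived from theta0.
    """
    minimum_diff = epsilon   # only residuals below epsilon are accepted
    min_theta0 = -1.0        # sentinel: no acceptable angle found
    steps = int(2 * math.pi / delta)
    for i in range(steps + 1):
        theta0 = i * delta
        diff = residual(theta0)
        if diff < minimum_diff:
            minimum_diff = diff
            min_theta0 = theta0
    return min_theta0
```

Initialising the running minimum to the threshold means the function returns the sentinel -1 whenever no angle brings the two sides of the equation within ε of each other, which is what the caller in Figure 3.4 checks for.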

Theorem 3.2 The time complexity of the capsule generation algorithm is $O(1)$.

Proof: While generating a capsule, the rawDataTransformation function uses the
Find θ0 function twice. The time complexity of considering the range of all angles
from 0 to 2π in Find θ0 is $O(1)$ as it does the same simple assignments $2\pi/\delta$ times.
Therefore, the time complexity of the whole algorithm is $O(1)$.

However, despite the low order, this algorithm can be CPU intensive. A sampler
agent applies this algorithm for every four consecutive measurements. Thus, it
may have to apply it many times if it acquired many measurements or if many
targets passed through its sector. A sampler agent may sometimes not have sufficient
resources to execute this algorithm many times in real time. In such cases, one
can consider using simpler sampling agents, i.e., with smaller detection sectors,
which will reduce the computation load on a single agent. One may also consider
taking fewer samples.

rawDataTransformation function

Input: sa, sample0, sample1, sample2, sample3
Output: target states

θ0 = Find θ0(sa, sample0, sample1, sample2)
θ1 = Find θ0(sa, sample1, sample2, sample3)
if (θ0 ≠ -1 && θ1 ≠ -1)
    θ3 = calculate θ3 using θ0 by equation 3.7
    θ3' = calculate θ3 using θ1 by equation 3.7
    if (the difference between θ3 and θ3' < epsilon)
        Return (D(θ3), V(θ3)), (D(-θ3), V(-θ3))
Return null

Figure 3.4: The rawDataTransformation function

Capsule generation algorithm

Input: sa, sample0, sample1, sample2, sample3
Output: capsule

targetStateSet = rawDataTransformation(sa, sample0, sample1, sample2, sample3)
If (targetStateSet ≠ null)
    capsule = new Capsule()
    capsule.sa = sa
    capsule.states = targetStateSet
else
    capsule = null

Return capsule

Figure 3.5: Capsule generation algorithm.

Example: We will now present an example of how a sampler agent forms a
capsule from four consecutive measurements. Consider a case of a sampler agent
located at the coordinates $D_a = (200, 200)$ with orientation of 0 degrees. The
sampler uses a Doppler with the characteristics $k = 1$ and $\sigma = 1$ and maximum
detection range of 100 meters (see Figure 3.1). Consider the following measurements
taken by the Doppler.

Time    η           V_r
0       1.08E-04    0.014141
1       1.11E-04    0.042405
2       1.15E-04    0.070619
3       1.18E-04    0.098748

Scanning the range of 0, ..., 2π for the value of $\theta_0$ in the Find θ0 function, the
algorithm computes $\theta_1(\theta_0)$ and $\theta_2(\theta_0)$ using equation 3.7. At the end of the scanning
loop the algorithm finds that when $\theta_0$ had the value of -0.7854 the difference
between the left side and the right side of equation 3.5 was minimal. Doing the
same in the case of $\theta_1$, the algorithm finds that when $\theta_1$ was -0.7654 the difference
between the left side and the right side of equation 3.5 was minimal.

The algorithm then uses equation 3.5 to find that $\theta_3(\theta_0)$ is -0.72547 and
that $\theta_3'(\theta_1)$ is also -0.72547. Realizing that the values of the calculated $\theta_3(\theta_0)$
and $\theta_3'(\theta_1)$ are equal, the algorithm constructs two target states. The first is
$(D(\theta_3), V(\theta_3))$ and the second is $(D(-\theta_3), V(-\theta_3))$. The values of $D(\theta)$ and
$V(\theta)$ are given by the following equations:

• $D(\theta) = (R(\theta) \cdot \sin(\theta) + D_a.x,\ R(\theta) \cdot \cos(\theta) + D_a.y)$

• $V(\theta) = (V_r \cdot \sin(\theta),\ V_r \cdot \cos(\theta))$

where $R(\theta)$ is given by equation 3.8 and, in our case, $R(\theta_3) = 70.8378$. According
to our example the two target states will be ((153, 253), (1, 1)) and ((247, 253), (-1, 1)).
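The conversion from the polar estimate to the two Cartesian candidates can be checked numerically. The sketch below applies the $D(\theta)$ and $V(\theta)$ formulas as printed; the function names are hypothetical, and reusing the same range for $\theta$ and $-\theta$ assumes a beam angle of zero, so that the range depends only on $\theta^2$. Only the locations are compared with the worked example.

```python
import math


def target_state(theta, R, v_r, sensor_xy):
    """Cartesian (location, velocity) for a target at polar (R, theta)."""
    x = R * math.sin(theta) + sensor_xy[0]
    y = R * math.cos(theta) + sensor_xy[1]
    return (x, y), (v_r * math.sin(theta), v_r * math.cos(theta))


def candidate_states(theta, R, v_r, sensor_xy):
    """The two mirror-image states stored in a capsule: theta and -theta."""
    return (target_state(theta, R, v_r, sensor_xy),
            target_state(-theta, R, v_r, sensor_xy))


# Values from the example: theta_3 = -0.72547, R(theta_3) = 70.8378,
# sensor at (200, 200); v_r is the last measured radial velocity.
(loc_a, _), (loc_b, _) = candidate_states(-0.72547, 70.8378, 0.098748, (200, 200))
print([round(c) for c in loc_a], [round(c) for c in loc_b])
# prints [153, 253] [247, 253]
```

The two locations are reflections of each other about the sensor's beam axis, which matches the (153, 253) and (247, 253) locations quoted in the example.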

3.3.2  Leader  localinfo  generation  algorithm 

Every dT seconds each group leader performs the localInfo generation algorithm.
Each group leader holds its own localInfo. The leader starts by purging data older
than a predefined τ seconds before processing new data to avoid data overloading.
Updating localInfo involves three steps: (i) obtaining new information from the
leader’s subordinates; (ii) finding new paths; and (iii) merging the new paths into
localInfo.

Figure 3.6 describes the algorithm for (i) in which every leader obtains information
from its subordinates. The sampler group leader obtains information from
all of its sampling agents for their unusedCapsules and adds them to its
unusedCapsules set. The zone group leader obtains from its subordinates their localInfo.
It adds the unusedCapsules to its unusedCapsules and merges the infoMap of that
localInfo to its own localInfo.

Merging  of  functions  is  performed  both  in  steps  (i)  and  (iii):  this  is  needed 
since,  as  we  noted  earlier,  target  state  functions  that  a  leader  has  inserted  into  the 
information  map  are  accepted  by  the  system  as  correct  and  will  not  be  removed. 
However,  different  agents  may  sense  the  same  target  and  therefore  it  may  be  that 
different  functions  coming  from  different  agents  will  refer  to  the  same  target.  The 
agents  should  recognize  such  cases  and  keep  only  one  of  these  functions  in  the 
infoMap.  We  use  the  next  lemma  to  find  and  merge  identical  functions. 


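Lemma 3.1 Let $p^1 = (\pi^1_1, \ldots, \pi^1_n)$, $p^2 = (\pi^2_1, \ldots, \pi^2_m)$ be two paths, where
$\pi^1_i = (t^1_i, sa^1_i, s^1_i)$, $\pi^2_j = (t^2_j, sa^2_j, s^2_j)$ and

$$f^1_{\pi_s,\pi_e}(t) = (\pi^1_s.s_s.D + \pi^1_s.s_s.V \cdot (t - \pi^1_s.t_s),\ \pi^1_s.s_s.V)$$

$$f^2_{\pi_s,\pi_e}(t) = (\pi^2_s.s_s.D + \pi^2_s.s_s.V \cdot (t - \pi^2_s.t_s),\ \pi^2_s.s_s.V)$$

If $ResBy((t^1_i, s^1_i), (t^2_j, s^2_j))$ then for any $t$, $f^1_{\pi_s,\pi_e}(t) = f^2_{\pi_s,\pi_e}(t)$.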


The mergeFunctions algorithm shown in Figure 3.7 is based on lemma 3.1.
In that algorithm, the leader uses the ResBy relation to check whether the first
state of the target state function results from the first state of a different target
state function. If one of the states results from the other, the leader changes the
minimum and the maximum triplets of the target state function. The minimum
triplet is the starting triplet that has the lowest time. The maximum triplet is the
ending triplet that has the highest time. Intuitively, the two state functions are
merged and the new function corresponds to the largest range given the found
points. If a leader cannot find any target state function that matches the
subordinate’s function, the leader will add it as a new function to its infoMap.
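A compact Python sketch of this merging step follows. It is illustrative rather than the report's code: the record layout (`D0`, `V`, `t_s`, `t_e`), the Euclidean distance test, and the tolerance defaults `eps_d` and `eps_v` are assumptions standing in for the comparison of f.D(0) and f.V in the mergeFunctions pseudo-code.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec = Tuple[float, float]


@dataclass
class StateFunction:
    D0: Vec     # location extrapolated to t = 0, i.e. f.D(0)
    V: Vec      # constant velocity
    t_s: float  # start of validity range
    t_e: float  # end of validity range


def _dist(a: Vec, b: Vec) -> float:
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5


def merge_functions(F: List[StateFunction], F_new: List[StateFunction],
                    eps_d: float = 1.0, eps_v: float = 0.1) -> List[StateFunction]:
    """Fold F_new into F, widening the time range of matching functions."""
    for g in F_new:
        for f in F:
            if _dist(f.D0, g.D0) < eps_d and _dist(f.V, g.V) < eps_v:
                f.t_s = min(f.t_s, g.t_s)   # keep the earliest start
                f.t_e = max(f.t_e, g.t_e)   # keep the latest end
                break
        else:
            F.append(g)  # no match: treat it as a new target
    return F
```

Comparing functions at a common reference time (here t = 0) works because two constant-velocity functions that describe the same target agree everywhere once they agree at one time and in velocity.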


Proposition 3.1 The time complexity of the obtaining new information algorithm
is $O(T^2)$ where $T$ is the number of targets in the τ seconds window of time in
which target information is kept by an agent.

Proof: While obtaining new information, agents implementing the algorithm
query each subordinate agent for information. The number of subordinate agents
is predefined and therefore constant. In the case of the sampler leader the
algorithm combines all capsules. The number of capsules depends on the number of
targets up to a constant factor. The constant factor depends on predefined constant
values, such as the number of agents and the time period for sampling. Therefore
the time complexity for the sampler leader component is $O(T)$. However, for each
subordinate leader, the zone group leader also performs the mergeFunctions
algorithm. The time complexity of the mergeFunctions algorithm is $O(T^2)$ as it runs
over a set of target state functions for every other target state function in another set.

Step 1 - Obtaining new information algorithm

In: localInfo
Out: updated localInfo

if activated as sampler group leader
    for each subordinate sampler, sampler
        additionalCapsules = obtain set of capsules from sampler
        localInfo.unusedCapsules = localInfo.unusedCapsules ∪ additionalCapsules
else % activated as zone group leader
    for each subordinate leader, leader
        % in this part we identify identical functions and leave only one of them
        additionalLocalInfo = ask leader for its localInfo
        additionalCapsules = additionalLocalInfo.unusedCapsules
        additionalInfoMap = additionalLocalInfo.infoMap
        localInfo.unusedCapsules = localInfo.unusedCapsules ∪ additionalCapsules
        mergeFunctions(localInfo.infoMap, additionalInfoMap)
return infoMap, unusedCapsules

Figure 3.6: Obtaining new information algorithm.

mergeFunctions algorithm

In: target function sets: F, F'
Out: updated target function set: F

for each state function, f', in F'
    merged = false
    for each state function, f, in F && not merged
        if (|f'.D(0) - f.D(0)| < εD && |f'.V - f.V| < εV)
            f.t_s = min(f'.t_s, f.t_s)
            f.t_e = max(f'.t_e, f.t_e)
            merged = true
    if (not merged)
        F = F ∪ {f'}

Return F

Figure 3.7: mergeFunctions algorithm.

The second step, as shown in Figure 3.10, is conducted by every leader to find
paths and extend current paths given a set of capsules. In order to form paths of
capsules, the agent should choose only one target state out of each capsule. This
constraint is based on the following lemma. According to this lemma one state
of one capsule cannot be in a ResBy relation with two different states in another
capsule, with respect to the capsule’s time.

Lemma 3.2 Let $c^1 = (t^1, sa^1, \{s^1_1, s^1_2\})$, $c^2 = (t^2, sa^2, \{s^2_1, s^2_2\})$; then if
$ResBy((t^1, s^1_1), (t^2, s^2_1))$ then

• (i) $ResBy((t^1, s^1_1), (t^2, s^2_2))$ is false and

• (ii) $ResBy((t^1, s^1_2), (t^2, s^2_1))$ is false.

Proof: If a capsule could be in such a relationship with both target states, the
two target states would stand in a ResBy relation between themselves. Given that
the two target states have the same creation time and that a target cannot be at the
same time in two places, that is not possible.

For the algorithm in Figure 3.10 that creates new paths we add two temporary
fields to two of the structures only for the purposes of the algorithm. The first
is a boolean flag named mark that will be added to the capsule structure. The
second is a pointer to the original capsule that will be added to every triple stored
in a path. In the first phase, every agent tries to fit every state in unused capsules
to an existing path. If the state does not fit, a new path will be created, starting
at that state. In the second phase the agent separates the paths into accurate and
semi-accurate paths according to the number of sampling agents generating them.

Example: Consider a case in which the Finding_new_paths algorithm receives the
following set of capsules as unusedCapsules:

                Sensing Agent State       Target State A        Target State B
Capsule  Time   Location   Orientation    Location   Velocity   Location   Velocity
0        0      0,0        0              100,100    5,5        60,60      3,-1
1        0      0,0        0              50,50      2,3        30,40      -2,-2
2        1      0,0        0              105,105    5,5        63,59      3,-1
3        1      0,0        0              70,80      2,3        52,53      2,3
4        2      0,0        0              66,58      -1,3       110,110    5,5
5        3      10,10      0              56,59      2,3        30,30      3,3

Figure 3.8: Example of a set of unusedCapsules received by the
Finding_new_paths algorithm.

Let us assume that at the beginning of the algorithm shown in Figure 3.10,
allPaths does not contain a path (line 2). Considering each target state in each
capsule (lines 3 to 5), we start with TargetStateA of capsule 0. Because allPaths
does not contain a path, a new path will be created with TargetStateA of capsule
0 at its head (lines 16 to 18). The next state, TargetStateB of capsule 0, will
be tested to see if it is in a ResBy relation with any of the tails of the paths
stored in allPaths (line 10). It does not stand in a ResBy relation with the only
tail that exists: TargetStateA of capsule 0. Therefore, a new path will be created
with TargetStateB of capsule 0 at its head (lines 16 to 18). TargetStateA and
TargetStateB of capsule 1 will each result in a new path as well. However, TargetStateA
of capsule 2 is in a ResBy relation with TargetStateA of capsule 0 and will join
its path as a new tail (line 14). TargetStateB of capsule 2 will do the same with
TargetStateB of capsule 0. At the end of the first phase 6 paths will be formed, 3
of which are made from more than one capsule.

Figure 3.9: An example of an outcome of phase 1

In  Figure  3.9  we  summarize  the  outcome  of  phase  I.  Each  arrow  corresponds 
to  a  ResBy  relation  between  two  path  points.  Every  capsule  contains  the  original 
sampler  state,  which  is  not  shown  in  the  figure  due  to  space  limitations. 

In phase 2, the algorithm finds that one of the paths in allPaths is formed by
capsules generated by agents with different states. This path is an accurate path
and will be added to the accurate paths set. The algorithm removes all target states
that share the same capsule with accurate paths’ target states. At the end of the
second phase, the algorithm looks for semi-accurate paths. A semi-accurate path is
a path of target states sensed by the same agent at the same agent state.

The function pathToFunc receives a path and returns a function based on it. In
our case it will receive the paths shown in the left side of the figure and return:
$f_{\pi_s,\pi_e}(t) = ((50, 50) + (2, 3) \cdot t,\ (2, 3))$ with respect to the first and the last
path points: $\pi_s = (0, ((0, 0), 0), ((50, 50), (2, 3)))$ and
$\pi_e = (3, ((0, 10), 0), ((56, 59), (2, 3)))$.

Proposition 3.2 The finding new paths algorithm time complexity is $O(T^2)$ where
$T$ is the number of targets in the τ seconds window of time passing through the
controlled area.

Proof: In this algorithm, every leader runs over a set of paths for every capsule
in its set of capsules. The sizes of the path and capsule sets are correlated with the
number of targets in the time period of τ, which therefore results in a time complexity of
$O(T^2)$.
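The linking phase and the accurate-path test can be tried out on the capsules of Figure 3.8 with a short Python sketch. Several parts are assumptions rather than the report's definitions: `res_by` is a plain constant-velocity check standing in for ResBy (which is defined elsewhere in the report), the additional condition on line 13 of Figure 3.10 and the capsule-marking bookkeeping of phase 2 are omitted, and all names are hypothetical.

```python
from collections import namedtuple

Capsule = namedtuple("Capsule", "id t sa states")  # states: {"A": (D, V), "B": (D, V)}
Point = namedtuple("Point", "t sa D V capsule label")


def res_by(p, q, tol=1e-6):
    """Assumed ResBy: q follows p under constant velocity."""
    if q.t <= p.t:
        return False
    dt = q.t - p.t
    same_v = all(abs(a - b) <= tol for a, b in zip(p.V, q.V))
    on_track = all(abs(d + v * dt - e) <= tol for d, v, e in zip(p.D, p.V, q.D))
    return same_v and on_track


def link_paths(capsules):
    """Phase 1: append each state to the first path whose tail it results from."""
    paths = []
    for c in sorted(capsules, key=lambda c: c.t):
        for label, (D, V) in sorted(c.states.items()):
            q = Point(c.t, c.sa, D, V, c.id, label)
            for path in paths:
                if res_by(path[-1], q):
                    path.append(q)
                    break
            else:
                paths.append([q])  # no tail fits: start a new path
    return paths


def accurate_paths(paths):
    """Phase 2 (simplified): keep paths seen from more than one agent state."""
    return [p for p in paths if len({pt.sa for pt in p}) > 1]


def path_to_func(path):
    """Location as a function of time, anchored at the first path point."""
    p0 = path[0]
    return lambda t: tuple(d + v * (t - p0.t) for d, v in zip(p0.D, p0.V))


O = ((0, 0), 0)  # sensing agent state shared by capsules 0 to 4
capsules = [
    Capsule(0, 0, O, {"A": ((100, 100), (5, 5)), "B": ((60, 60), (3, -1))}),
    Capsule(1, 0, O, {"A": ((50, 50), (2, 3)), "B": ((30, 40), (-2, -2))}),
    Capsule(2, 1, O, {"A": ((105, 105), (5, 5)), "B": ((63, 59), (3, -1))}),
    Capsule(3, 1, O, {"A": ((70, 80), (2, 3)), "B": ((52, 53), (2, 3))}),
    Capsule(4, 2, O, {"A": ((66, 58), (-1, 3)), "B": ((110, 110), (5, 5))}),
    Capsule(5, 3, ((10, 10), 0), {"A": ((56, 59), (2, 3)), "B": ((30, 30), (3, 3))}),
]
paths = link_paths(capsules)
best = accurate_paths(paths)
```

Under these assumptions three paths span more than one capsule, and exactly one of them (capsule 1 state A, capsule 3 state B, capsule 5 state A) was observed from two different agent states; applying `path_to_func` to it gives the location (56, 59) at t = 3, in line with the function reported in the example.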

3.3.3  The  movement  of  a  sampler  agent 

While the integration algorithms play an important role in producing an accurate
infoMap, ultimately, the accuracy of the infoMap fundamentally depends on the
accuracy of the observations made by the sampling agents. There are several
degrees of freedom associated with the movements of sampler agents. At this point
in our research we wanted the sampler agents to move autonomously according to
a predefined algorithm without making any assumptions regarding target location.
We hypothesize that the following criteria should be considered when determining
the sampler agent’s behavior: (i) the union of all the sensed area at time t should
be maximized and (ii) the intersection of the areas sensed by sampling agent s at
time t and at t+1 should be minimized. One of the ways to achieve this is to move
in the pattern demonstrated in figure 3.11. We refer to this pattern as the
Patrol movement pattern.

We  compared  the  patrol  movement  pattern  with  a  steady  random  movement 
that  was  used  by  the  agents  in  [[Yadgar  2002,  Yadgar  2003]].  A  steady  random 
movement  is  defined  as  a  movement  in  a  random  direction  and  velocity.  Upon 
reaching  the  end  of  the  controlled  zone,  the  velocity  and  the  direction  is  changed 
and  re-directed  into  the  zone.  We  found  out  that  most  of  the  time  the  patrol  move¬ 
ment  pattern  is  more  efficient  than  the  random  one.  Hence,  we  will  present  our 
simulation  results  using  the  former. 
Step 2 - Finding new paths algorithm

In: unusedCapsules
Out: updated unusedCapsules, accurateFunctions, mediocrePaths

% phase 1: make links
(1)  sort(unusedCapsules)  % by time stamp
(2)  allPaths = {}
(3)  for each capsule, c = (t, sa, {s1, s2}), in unusedCapsules
(4)      c.mark = false  % marking for phase 2
(5)      for each target state, si, in c's states
(6)          linked = false
(7)          % because of the above assumption and given that the path
(8)          % elements came from capsules there will be only one suitable
(9)          % path. Therefore, we exit the loop after finding such a path
(10)         for every last triplet, (t_last, sa_last, s_last), in each path, p,
(11)                 in allPaths && not linked
(12)             if (ResBy((t_last, s_last), (t, si))
(13)                     or (t_last = t && sa_last ≠ sa))
(14)                 p = p ∘ ((t, sa, si))  % append the triplet to p
(15)                 linked = true
(16)         if (not linked)
(17)             p = ((t, sa, si))
(18)             allPaths = allPaths ∪ {p}
(19) % phase 2: collect target-representing paths that have no common capsules,
(20) % giving greater priority to paths with more viewpoints
(21) sort(allPaths)  % by number of viewpoints
(22) paths = {}
(23) for each path, p, in allPaths
(24)     if (not isAnyCapsuleMarked(p) && numberOfViewpoint(p) > 1)
(25)         markAllCapsules(p)
(26)         unusedCapsules = unusedCapsules − allCapsules(p)
(27)         accurateFunctions =
(28)             accurateFunctions ∪ pathToFunc(p)
(29) if activated as top-level leader
(30)     mediocrePaths = collectMediocrePaths(allPaths)
(31) else
(32)     mediocrePaths = {}
(33) Return unusedCapsules, accurateFunctions, mediocrePaths


Figure  3.10:  Finding  new  paths  algorithm. 
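
The second phase of the algorithm in Figure 3.10 is a greedy selection, and a minimal Python sketch may make it easier to follow. The `Triplet` fields, the use of a capsule identifier, and the reading of "viewpoints" as the number of distinct samplers contributing to a path are assumptions introduced for this sketch; the conversion of a path into a state function (pathToFunc) is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triplet:
    """One path element (hypothetical fields, for illustration only)."""
    capsule_id: int
    time: float
    sampler_id: int


def number_of_viewpoints(path):
    """Assumed meaning of 'viewpoints': distinct samplers along the path."""
    return len({t.sampler_id for t in path})


def select_paths(all_paths):
    """Greedy phase 2: prefer paths with more viewpoints, skip shared capsules."""
    marked = set()      # capsule ids already claimed by a selected path
    selected = []
    for path in sorted(all_paths, key=number_of_viewpoints, reverse=True):
        capsules = {t.capsule_id for t in path}
        if number_of_viewpoints(path) > 1 and not (capsules & marked):
            marked |= capsules
            selected.append(path)
    return selected, marked
```

Sorting first and then marking capsules is what gives priority to paths supported by more viewpoints: once such a path claims its capsules, any competing path that reuses one of them is rejected.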
Figure 3.11: Patrol movement pattern.

3.4  Simulation,  experiments  and  results 

3.4.1  Simulation  environment 

For this work, we did not use the RadSim simulator. Instead, we developed
a separate simulation to study the problems associated with applying
the DDM in a large-scale, mobile-agent environment. The simulation consists
of an area of fixed size in which Dopplers attempt to extract the object state
functions of moving targets. Each target starts at a random location along the
border of the area with a random initial velocity of up to 50 km per hour in a
direction that leads inwards. Targets leave the area when they reach the boundaries
of the zone. Each target that leaves the area causes a new target to appear at a
random location along the border, with a random velocity in a direction that
leads inwards. Therefore, each target may remain in the area for a random time
period. Figures 3.12 and 3.13 are screen dumps of a simulation in progress.

To run our simulations we used a Pentium 4 computer with 1 GB of RAM
running Windows 2000.

3.4.2  Evaluation  methods 

We  collected  the  state  functions  produced  by  agents  during  a  simulation.  We  used 
two  evaluation  criteria  in  our  simulations:  (1)  target  tracking  percentage  and  (2) 
average  tracking  time.  We  counted  a  target  as  tracked  if  the  path  identified  by  the 
agent  satisfied  the  following:  (a)  the  maximum  distance  between  the  calculated 
location  and  the  real  location  of  the  target  did  not  exceed  1  centimeter,  and  (b)  the 
maximum  difference  between  the  calculated  v(t)  vector  and  the  real  v(t)  vector 
was  less  than  1  centimeter  per  second. 
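
The two conditions for counting a target as tracked can be expressed as a short check. The following is a sketch under stated assumptions: states are represented as `(x, y, vx, vy)` tuples in meters and meters per second, the estimated and real trajectories are compared at the same time points, and the function and constant names are hypothetical.

```python
import math

MAX_POSITION_ERROR_M = 0.01     # 1 centimeter
MAX_VELOCITY_ERROR_MPS = 0.01   # 1 centimeter per second


def is_tracked(estimated, actual):
    """Apply criteria (a) and (b) to two equally sampled trajectories."""
    position_error = max(
        math.hypot(e[0] - a[0], e[1] - a[1]) for e, a in zip(estimated, actual)
    )
    velocity_error = max(
        math.hypot(e[2] - a[2], e[3] - a[3]) for e, a in zip(estimated, actual)
    )
    # (a) location error must not exceed 1 cm; (b) velocity error must be below 1 cm/s.
    return (position_error <= MAX_POSITION_ERROR_M
            and velocity_error < MAX_VELOCITY_ERROR_MPS)
```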
Figure 3.12: Simulating 2 Doppler radars tracking 30 targets. The dots represent
sampled target states. The shades of lines represent 100% and 50% tracked targets.

Figure 3.13: Simulating 20 Doppler radars tracking 30 targets. The dots represent
sampled target states and the lines represent tracked targets.

In addition, the identified object state functions were divided into two
categories. In the first category, only a single function, chosen to be part of the
infoMap, is associated with a particular target; those functions were assigned
a probability of 100% and corresponded to the actual object state function (we
refer to this as the case of accurate tracking). In the second category, two
possible object state functions are associated with a particular target. In that
case, each state function was assigned a 50% probability of corresponding to the
actual state function. We refer to those functions as semi-accurate. We say
that one set of agents did better than another if it reached a higher tracking
percentage and a lower tracking time with respect to the accurate functions, and
its total tracking percentage was at least as high.

The averages reported in the graphs below were computed over one hour of
simulated time. The target tracking percentage was calculated by dividing
the number of targets that the agents succeeded in tracking, according to the above
definitions, by the actual number of targets during the simulated hour. We
considered only targets that exited the controlled zone. The tracking time was defined
as the time that the agents needed to find the object state function of a target,
measured from the time the target entered the simulation. The average tracking time
was calculated by dividing the sum of the tracking times of the tracked targets by
the number of tracked targets.
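
The two evaluation measures defined above might be computed as in the following sketch. The record layout (dictionaries with `exited`, `entered_at` and `tracked_at` keys, times in seconds) is an assumption made for illustration and is not the format used by the simulation.

```python
def tracking_metrics(targets):
    """Return (tracking percentage, average tracking time in seconds).

    Each target is a dict with 'exited' (bool), 'entered_at' (seconds) and
    'tracked_at' (seconds, or None if the target was never tracked).
    """
    # Only targets that exited the controlled zone are considered.
    exited = [t for t in targets if t["exited"]]
    tracked = [t for t in exited if t["tracked_at"] is not None]
    percentage = 100.0 * len(tracked) / len(exited) if exited else 0.0
    times = [t["tracked_at"] - t["entered_at"] for t in tracked]
    average_time = sum(times) / len(times) if times else None
    return percentage, average_time
```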

Basic  settings:  The  basic  setting  for  the  environment  corresponded  to  an  area 
of  10,000  by  10,000  meters.  In  each  experiment,  we  varied  one  of  the  parameters 
of  the  environment,  keeping  the  other  values  of  the  environment  parameters  as  in 
the  basic  settings. 

Each Doppler moved for one second and then stopped for 5 seconds to take 5
measurements. The maximum detection range of a Doppler in the basic setting was 100
meters; the number of Dopplers was 1,000. The controlled area was divided into
1,000 equal rectangles, each 400 by 250 meters. Each patrolling Doppler
was assigned to one such rectangle and executed the patrol movement pattern. Together,
1,000 Dopplers with a detection range of 100 meters each can cover up to
approximately 8,000,000 square meters, which is 8% of the controlled area.
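
The coverage figure quoted above can be checked with a few lines of arithmetic. The sketch below simply restates the numbers given for the basic setting; the variable names are illustrative, and the comparison with a full circular footprint is an observation added here rather than a statement from the experiments.

```python
import math

AREA_SIDE_M = 10_000             # controlled area is 10,000 m by 10,000 m
NUM_DOPPLERS = 1_000
CELL_W_M, CELL_H_M = 400, 250    # one rectangle per patrolling Doppler
REPORTED_COVERAGE_M2 = 8_000_000
DETECTION_RANGE_M = 100

total_area_m2 = AREA_SIDE_M ** 2
cells_area_m2 = NUM_DOPPLERS * CELL_W_M * CELL_H_M
coverage_fraction = REPORTED_COVERAGE_M2 / total_area_m2
per_doppler_m2 = REPORTED_COVERAGE_M2 / NUM_DOPPLERS
full_disc_m2 = math.pi * DETECTION_RANGE_M ** 2

print(f"rectangles tile the area: {cells_area_m2 == total_area_m2}")
print(f"coverage fraction: {coverage_fraction:.0%}")
print(f"per-Doppler footprint: {per_doppler_m2:.0f} m^2 "
      f"(a full disc would be {full_disc_m2:.0f} m^2)")
```

The implied footprint of about 8,000 square meters per Doppler is roughly a quarter of a full disc of radius 100 meters, which would be consistent with a Doppler sensing only one sector at a time; that reading is an inference, not something stated in the text.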

The  number  of  targets  at  a  given  time  point  was  1,500.  In  total,  during  one 
hour  5,700  targets  entered  the  controlled  area  and  4,200  of  them  exited  the  area. 

In the basic setting we used a hierarchy of 4 levels: three levels of zone group
leaders and one of sampler group leaders. Each zone group leader divided
its zone into 4 areas and assigned a sub-leader to each of them. Therefore
there was one leader at the top level, 4 at the second level, 16 at the third and
64 at the fourth. Each Doppler sensor communicated with one of the fourth-level
leaders.

Figure 3.14: Tracking percentage by time in zone (Sec.).

3.4.3  Results 

We conducted three sets of tests: (i) evaluating the basic settings, (ii) investigating
the influence of the number of levels in the hierarchy, and (iii) studying the
tolerance towards faulty sensing agents, faulty leaders and sensing noise. At this stage of
our research, samplers and leaders do not react to changes in the functioning
agent community.

Basic  settings  results:  Our  hypothesis  was  that  by  applying  the  DDM  hierarchy 
model  we  would  be  able  to  very  quickly  track  many  targets.  We  also  hypothesized 
that  the  tracking  period  for  each  target  would  be  significant.  We  ran  the  simulation 
using  the  basic  settings  and  evaluated  the  results. 

Figure  3.14  shows  the  percentage  of  tracked  targets  as  a  function  of  the  time 
each  target  remained  in  a  zone.  To  put  this  histogram  in  context  we  added  the  gray 
graph  that  corresponds  to  the  right  legend.  This  graph  reflects  target  distribution 
with  respect  to  the  time  spent  in  the  zone. 
Figure 3.15: Time to track distribution (Sec.).


We  can  see  that  the  system  accurately  tracked  83%  of  the  targets.  This  was 
achieved  with  Dopplers  covering  only  8%  of  the  area.  A  little  more  than  50%  of 
the  targets  that  stayed  in  the  controlled  zone  less  than  360  seconds  were  tracked. 
Note  that  most  of  the  targets  passing  through  the  simulated  area  remained  in  the 
area  less  than  720  seconds.  During  that  time  the  patrol  method  tracked  many 
targets  and  therefore  achieved  rapid  tracking. 

Figure 3.15 shows how long it took to track targets after they entered a
zone. Most of the tracking was achieved less than 2 minutes after
a target’s entrance into the zone: the system tracked 71% of its tracked targets in
this period.

Figure 3.16 plots the tracking duration, which is the period of time between
the first and the last time a target was detected. The figure indicates that shorter
tracking durations are more common than longer ones. However, most of the tracked
targets were tracked for more than 6 minutes.

Level  comparison:  We  investigated  the  influence  of  the  number  of  levels  in 
the  hierarchy.  Our  hypothesis  was  that  too  few  levels  would  overload  the  leader 
agents  so  they  would  not  have  enough  time  to  process  the  information.  We  also 
anticipated  that,  as  more  leader  agents  were  involved  in  generating  the  global 
solution,  a  less  accurate  solution  would  result. 

Figure 3.17 presents the tracking performance of the system as a function of
the number of levels in the hierarchy. As we hypothesized, the system tracked
fewer targets as the number of levels increased. This can be explained by a greater
fragmentation of the zone, i.e., 4 quarters with 2 levels versus 64 areas with 4 levels.
The figure shows that the decrease is small.

Figure 3.16: Tracking duration distribution (Sec.).

Figure 3.17: Accurate tracked target percentage as a function of the number of
levels.

Figure 3.18: Accurate tracking time (Sec.) as a function of the number of levels.

As  shown  in  Figure  3.18,  the  average  time  to  track  a  target  increases  as  the 
number  of  levels  increases.  However,  it  increases  only  from  100  seconds  to  106 
seconds  while  the  number  of  levels  increases  from  1  to  4. 

Figure 3.19 presents the time it took an agent to perform its task. The
figure shows the maximum time measured on the computer described
above. The maximum time is very close to the average time; therefore
we do not present the latter here. As we predicted, with only one level the
agent needs more time than it has. In our case an agent needed 16,000 seconds
(roughly four and a half hours) to process data gathered during 1 hour. This means
that with one level the system will not converge. Using 2 levels enabled the system
to solve the problem in only 35 minutes. Using 4 levels decreased the maximum
time that an agent needed to process data collected in an hour to only 10 minutes.

In Figure 3.20 we show the total number of bytes transferred between agents
during one hour, relative to the number of levels in the hierarchy. The capsules
generated by samplers and sent to sampler group leaders resulted in a transfer
of 4Mb. A massive communication load may cause a bottleneck at the
receiving agent, which may lead to delays. Moreover, such a bottleneck may result
in a loss of important information if agents fail. When more levels are added
to the hierarchy, more agents transfer information upwards and therefore
the total number of bytes transferred increases. On the other hand, adding more
levels decreases the average number of bytes each agent receives. Figure 3.21
shows how significantly the average communication load on the receiving agent is
reduced as the number of levels increases.

Figure 3.19: Maximum agent process time (Sec.) as a function of the number of
levels.

We used a hierarchy formation in which every level has four times as many agents
as the level above it. Therefore, the total number of leader agents receiving
information in the hierarchy is 1, 5, 21 and 85 when using a hierarchy of 1, 2, 3
and 4 levels, respectively. Figure 3.21 shows the number of bytes transferred
according to the number of agents.
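
The leader counts quoted above follow from a geometric series with ratio 4. A small sketch (function names are illustrative) reproduces them:

```python
def leaders_per_level(levels, branching=4):
    """Number of leaders at each level, from the top level downwards."""
    return [branching ** i for i in range(levels)]


def total_leaders(levels, branching=4):
    """Total number of leader agents in a hierarchy with the given depth."""
    return sum(leaders_per_level(levels, branching))


for n in range(1, 5):
    print(n, leaders_per_level(n), total_leaders(n))
```

With 4 levels and the 1,000 Dopplers of the basic setting, each of the 64 lowest-level leaders serves between 15 and 16 samplers on average, assuming the samplers are spread evenly.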

Dysfunctional sampling agents: To investigate the fault tolerance
of the hierarchy model in a large-scale environment, we disabled some of the
sampling agents. We increased the proportion of disabled sampling agents from 0%, as
in the basic settings, to 90%, leaving only 10% of the agents active. We hypothesized
that as the number of faulty sampling agents increased, the system would not
perform as well as in the basic settings. Our goal was to place a bound on the
number of dysfunctional sampling agents that the system could tolerate while still
performing well.

Figure 3.20: Bytes transferred as a function of number of levels.

Figure 3.22 shows the accurate tracked target percentage as a function of
the number of samplers that stopped functioning. We found that increasing
the number of disabled sampling agents also increases the time it takes to track
targets. Increasing the number of disabled sampling agents by 5% increased the
average time it takes to track a target by 6%.

Dysfunctional leader agents: A second aspect of the system’s fault tolerance
is its response to dysfunctional leaders. In contrast to dysfunctional samplers, a
dysfunctional leader results in a difference in the coverage of the system. For
example, consider a case in which a leader responsible for half of the controlled
area stops functioning. Using the patrol Doppler movement pattern will then result
in a loss of information from half of the samplers. We hypothesized that performance
would be significantly influenced by this factor. To validate this hypothesis we
conducted several simulations in which we varied the number of dysfunctional
sampler leaders.

Figure 3.23 confirms our hypothesis. It shows that the system could tolerate
a reduction of up to 13% in the number of functioning sampler leaders. A
reduction of 18% or more resulted in a very low performance level. However, although
the system demonstrated a poor tracking percentage for high rates of
dysfunctional sampler leaders, we discovered that it still tracked targets quickly. We
hypothesize that adopting a reactive approach that enforces a division of the area
among the active agents will overcome this problem. We plan to report the results
of our investigation of this hypothesis in a future document.

Figure 3.22: Accurate tracked target percentage as a function of dysfunctional
samplers.

Figure 3.23: Accurate tracked target percentage as a function of dysfunctional
first level leaders.

Noisy communication: As we stated, we would like to show that using simple
and cheap sensors may be beneficial even if they tend to malfunction or if
communication with their leaders degrades. We conducted a simulation
testing the system under faulty communication between samplers and
leaders. We predicted that the system would be tolerant towards such noise.

We found, as shown in Figure 3.24, that even if 50% of the messages did
not reach their destination (either because of faulty communication or faulty
samplers), the system still performed well. Losing 50% of the messages reduced
the number of tracked targets by only 5% and increased the tracking time by 20
seconds.

3.4.4  Small  Scale  Results 

We also tested the DDM in a small-scale environment to compare various
properties of the suggested solution. We tested the need for hierarchical structure and
sensor mobility.

Figure 3.24: Accurate tracked target percentage of patrol as a function of lost
communication messages between samplers and leaders.

Figure 3.25: Target tracking percentage and average time by the settings.

Basic Settings: The basic setting of the environment was 1200 by 900 meters.
In the experiments, we varied one of the parameters of the environment,
keeping the other environment parameters as in the basic settings. The
Dopplers were mobile and moved randomly as before. Each Doppler stopped
every 10 seconds, varied its active sector randomly, and took 10 measurements. The
maximum detection range of a Doppler in the basic setting was 200 meters; the
number of Dopplers was 20 and the number of targets at any given time was 30.
The DDM hierarchy consisted of only one level. That is, there was one
sampler-leader that was responsible for the entire area.

We first varied several settings of both the hierarchy model and the sampling
agents. Each setting corresponded to (1) whether a hierarchy model (H) or a flat
model (F) was employed; (2) whether the sampler agents were mobile (M) or
static (S); and (3) whether Dopplers varied their active sectors from time to time
(V) or used a constant one all the time (C). In the flat model the sampler agents
used their local capsules to produce task state functions locally. We refer to a
particular combination of settings by combining the labels corresponding to each
setting; for example, the label “FSC” corresponds to a flat, static model in
which Dopplers kept their active sector constant.
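
The labelling scheme can be enumerated mechanically. The sketch below is illustrative only: the dictionary names and the wording of the descriptions are assumptions, while the letters themselves are those defined in the text.

```python
from itertools import product

ORGANIZATION = {"H": "hierarchical model", "F": "flat model"}
MOBILITY = {"M": "mobile samplers", "S": "static samplers"}
SECTOR = {"V": "varying active sector", "C": "constant active sector"}


def all_labels():
    """Every combination of the three two-valued settings."""
    return ["".join(parts) for parts in product(ORGANIZATION, MOBILITY, SECTOR)]


def describe(label):
    """Expand a three-letter label such as 'FSC' into words."""
    organization, mobility, sector = label
    return f"{ORGANIZATION[organization]}, {MOBILITY[mobility]}, {SECTOR[sector]}"


for label in all_labels():
    print(label, "->", describe(label))
```

Three binary settings give eight possible combinations in total.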

Mobile and dynamic vs. static Dopplers: In preliminary simulations we
experimented with all combinations of parameters (1)-(3) above. In each setting,
keeping the other two variables fixed and varying only the mobility variable, the
mobile agents did better than the static ones (with respect to the evaluation
definition above).

Hierarchy vs. flat models: We examined the characteristics of 4 different
settings: (1) FSC, which involves static Dopplers with a constant active sector using a
non-hierarchical model; (2) HSC, as in (1) but using the hierarchical model; (3)
FMV, with mobile Dopplers that vary their active sectors from time to time, but
with no hierarchy; (4) HMV, as in (3) but using the hierarchical model. We tested
FSC in two experimental settings: one in which Dopplers were located randomly
and one in which Dopplers were arranged in a grid formation to achieve better
coverage. There was no significant difference between these two FSC formations.
Our hypothesis was that the agents in HMV would do better than in all the other
settings.

The first finding is presented in the left part of Figure 3.25. This finding
indicates that the setting does not affect the overall tracking percentage (i.e., the
tracking percentage of the 50% and 100% functions combined). The difference between
the settings lies in the division of the detected targets between accurate
tracking and mediocre tracking. HMV performed significantly better than the
other settings: it found significantly more 100% functions and did so faster than the
others. This supports the hypothesis that a hierarchical organization leads to
better performance. Further support for a hierarchical organization comes from HSC
being significantly better than FMV even though, according to our preliminary
results, HSC uses Dopplers that are more primitive than the Dopplers in FMV.

Another aspect of the performance of the models is the average tracking time,
shown in the right part of Figure 3.25. Once again, one can see that the
hierarchically based settings lead to better results. We found that, considering only
targets that stayed in the controlled zone for at least 60 seconds, HMV reached an 87%
tracking percentage and 83% were accurately detected.

We also studied the performance of hierarchies with two levels: one zone
leader leading four sampling leaders. The area was divided equally between the
four sampling leaders, and each obtained information from the mobile
sampling agents located in its area. In that configuration Dopplers were able to move
from one zone to another; a Doppler changed its sampling leader every time it
moved from one zone to another. Comparing the two-level hierarchy
simulations with the one-level hierarchy simulations, we found no
significant difference in performance (with respect to the evaluation definition).
However, consistent with Theorem 1, the computation time of the two-level system
was much lower.

Communication and noise: While the performance of the hierarchy-based
models is significantly better than that of the non-hierarchical ones, the agents in the
hierarchy model must communicate with one another, whereas no communication is needed
for the flat models. Thus, if no communication is possible, FMV should be used.
However, with communication, messages may be lost or corrupted. Recall that the
data structure exchanged in messages is the capsule; in our simulations using a
hierarchy model, each sampling agent transmitted 168 bytes per minute. We studied
the influence of randomly corrupted capsules on HMV’s behavior. Figure 3.26
shows that as the percentage of lost capsules increases, the number of tracked
targets decreases; however, up to a noise level of 10%, the detection percentage
decreased only from 74% to 65% and the accurate tracking time increased from 69
seconds to only 80 seconds. Noise of 5% results in a smaller decrease, to a tracking
accuracy of 70%, while the tracking time increases slightly to 71 seconds. DDM could
even manage with noise of 30%, tracking 39% of the targets with an average tracking
time of 115 seconds. In the rest of the experiments we used the HMV settings without
noise.

Varying the number of Dopplers and targets: We examined the effect of the
number of Dopplers on performance. We found that, when the number of
targets is kept fixed, the percentage of accurate tracking increases as the number of
Dopplers increases. This result demonstrates that the system can make good use
of additional resources that it might be given. We also found that as the
number of Doppler sensors increases, the number of 50% probability paths decreases.
This may be explained by the fact that 100% paths result from taking into consideration
more than one point of view of the samples. We also found that increasing the
number of targets while keeping the number of Dopplers fixed does not influence the
system’s performance. We speculate that this is because an active sector can
distinguish more than one target in that sector.

Maximum detection range comparison: We also tested the influence of the
detecting sector area on performance. The basic setting uses Dopplers with a
detection range of 200 meters. We compared the basic setting to similar ones
with detection ranges of 50, 100 and 150 meters. We found that as the maximum
range increases, the tracking percentage increases, up to the range that covers the
entire global area. As the maximum radius of detection increases, the average
tracking time decreases. This is a beneficial property, since it suggests that better
equipment (i.e., sensors and other hardware) will lead to better performance.

3.5  Related  work 

The benefits of hierarchical organizations have been argued by many. So and
Durfee draw on contingency theory to examine the benefits of a variety of
hierarchical organizations; they describe a hierarchically organized network monitoring
system for task decomposition and they also consider organizational self-design
[[So 1996]]. DDM differs in its use of organization to dynamically balance
computational load and also in its algorithms for support of mobile agents.

The idea of combining partial local solutions into a more complete global
solution goes back to early work on the Distributed Vehicle Monitoring Testbed
(DVMT) [[Lesser 1987]]. DVMT also operated in a domain of distributed sensors
that tracked objects. However, the algorithms for support of mobile sensors and
for the actual specifics of the Doppler sensors themselves are novel to the DDM
system. Within the DVMT, Corkill and Lesser investigated various team
organizations in terms of interest areas [[Corkill 1983]], which partitioned
problem-solving nodes according to roles and communication, but they were not initially
hierarchically organized [[Ishida 1992, Scott 1992]]. Wagner and Lesser
examined the role that knowledge of organizational structure can play in control
decisions [[Wagner 2000]].

Figure 3.26: Target detection percentage and average time as a function of the
communication noise.

Figure 3.27: Tracking percentage and average time as a function of the number of
Dopplers.

All of the other approaches discussed in this volume assume that agents are
stationary. Those approaches make use of measurements from three Doppler
sensors, taken at the same time, and intersect the arcs corresponding to each sensor.
The intersection method depends on the coordinated action of three Doppler
sensors to simultaneously sample the target. Such coordination requires good
synchronization of the clocks of the sensors and therefore communication among the
Doppler agents to achieve that synchronization. In addition, communication is
required for scheduling agent measurements. We have described an alternative that
can make use of uncertain measurements; we focus on the combination of partial
and local information. Note that even though our agents associate a time stamp
with each capsule, DDM does not require that the sensors be fully time
synchronized. The ResBy relation may allow a small deviation in time. For example,
ResBy((t1, s1), (t2, s2)) may be defined as

    s1.v = s2.v  &&  |(s1.x − t1 · s1.v) − (s2.x − t2 · s2.v)| < ε

Using a large ε indicates a high tolerance towards non-synchronized
clocks. However, increasing the value of ε increases the probability
of identifying two different targets as the same one.
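
To make the role of the tolerance concrete, the example ResBy relation can be sketched in Python. This is a simplified, one-dimensional illustration: the `State` class, the function name `res_by` and the numeric values are assumptions chosen for the example, and the actual relation used by DDM may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    x: float  # position (one dimension, for simplicity)
    v: float  # constant velocity


def res_by(t1, s1, t2, s2, eps):
    """True if both states extrapolate back to (nearly) the same position at time 0."""
    origin1 = s1.x - t1 * s1.v
    origin2 = s2.x - t2 * s2.v
    return s1.v == s2.v and abs(origin1 - origin2) < eps


first = State(x=10.0, v=2.0)        # sampled at t = 0
later = State(x=20.0, v=2.0)        # same target, true sampling time t = 5
other = State(x=20.5, v=2.0)        # a different target sampled at t = 5

print(res_by(0.0, first, 5.0, later, eps=0.1))   # exact clocks
print(res_by(0.0, first, 5.2, later, eps=0.1))   # clock skewed by 0.2 s, tight eps
print(res_by(0.0, first, 5.2, later, eps=1.0))   # clock skewed by 0.2 s, loose eps
print(res_by(0.0, first, 5.0, other, eps=1.0))   # loose eps merges a different target
```

The last two calls show the trade-off described above: the looser tolerance absorbs the clock skew, but it also accepts a state that belongs to another target.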

In  their  work,  Yu  and  Cysneiros  [[Yu  2002]]  describe  challenges  related  to 
large-scale  information  systems.  They  claim  that  large-scale  systems  have  the 
potential to support greater diversity, offering more flexibility and better
robustness as well as more powerful functionalities compared to traditional software
technologies.  In  our  work  we  address  these  challenges  and  propose  an  efficient 
solution. 

Silva et al. have developed the Reflective Blackboard architectural pattern for
large-scale  systems  [[Silva  2002]].  This  is  the  result  of  the  composition  of  two 
other  well-known  architectural  patterns:  the  Blackboard  pattern  and  the  Reflection 
pattern.  They  separate  control  strategies  from  the  logic  and  data.  In  our  work  we 
use  independent  agents  that  act  autonomously.  Such  a  loose  coupling  is  beneficial 
in  terms  of  simplicity,  robustness  and  fault  tolerance. 

Tel has studied the performance of a network tree with n processors providing
communication between every pair of processors with a minimal number of links
(n-1) [[Tel 1991]]. The communication complexity in a tree topology is influenced
by the diameter of the tree, which depends on the number of levels. Therefore a tree
with fewer levels will have a better communication complexity. However, each node
then has more computations to perform and can therefore become a bottleneck. A
failure of a node will also split the tree into a larger number of unconnected subsets.
In the work we have described, we have investigated the relation between the number
of levels in a hierarchical structure and performance; we have presented suggestions
for how to choose the right number of levels.


3.6  Conclusions 

We have shown that problems involving hundreds or thousands of Dopplers and
targets cannot be solved using a traditional flat architecture. Our approach was
based on dividing the problem into smaller problems that could be solved
partially by simple agents; the agents are, in addition, organized hierarchically. Using
many simple and cheap agents instead of a much smaller number of sophisticated
and expensive ones may also be cost-effective: it is often more affordable to
replace and maintain many simple agents than to depend on a few sophisticated
ones. Our approach incorporated methods for combining partial solutions to form
a global solution, and involved an autonomous movement algorithm that could be
executed by each sampling agent.

The number of levels in a hierarchy was shown to influence accuracy. As the
number of levels increased, the number of tracked targets dropped, although
only moderately. However, as the number of levels increased, the time needed by
every agent to complete its mission dropped exponentially. By combining these
two results we were able to balance both properties. Choosing the right number
of levels should also take into consideration the time it takes to track
targets: as we have shown, tracking takes longer as the number of levels in the
hierarchy increases.

To conclude, we have shown that a large-scale ANTs system can perform well
even if agents are very simple and inaccurate. We have shown how partial
information can be combined and how the existence of dysfunctional participants
can be overcome.


Chapter 4

Multi-channel  communications 
scheduling 


The ANTs challenge problem relies on wireless communication between a set of
sensors. Each sensor is equipped with a low-bandwidth transceiver, and eight
distinct radio channels are available for communication. Each sensor can be
programmed to send and receive on any of the eight channels; its transmission
and reception channels can be different. A sensor cannot send and receive at
the same time. Efficient use of the bandwidth requires good synchronization
between sending and receiving nodes to ensure that they are tuned to the right
channel at the right time.

SRI investigated graph-coloring algorithms for scheduling communication
between sensors. This approach assumes that sensors have a limited
communication range, so that transmissions from two sufficiently distant
sensors do not interfere even if they use the same band. Unfortunately, this
assumption is not valid for the hardware developed for the ANTs challenge
problem.

In a second phase of the project, we investigated a different approach to
sharing the RF medium by developing a channel reservation protocol for
ANTs-like networks. We experimented with a prototype implementation using the
challenge problem simulator. Unfortunately, performance was poor because RF
messages cannot be received and reacted to asynchronously within a Java virtual
machine, and because of the limitations of the Java scheduling model.


4.1 Scheduling Access to Radio Channels

4.1.1  Model 

A network of sensors that communicate via wireless links can be modeled
naturally as a graph G = (V, E):

•  Each vertex v of V represents one of the sensors.

•  There is an edge between two vertices v and v' if the two sensors v and v'
are within communication range of each other.

The set of neighbors of a node v, that is, the set of nodes within
communication range of v, is denoted by E(v).

This model assumes that communication links are symmetric: if v can send to v'
then v' can send to v. In the common case, the sensors are placed on a
two-dimensional plane, and two sensors are within communication range of each
other if the distance between them is less than a bound r. This gives rise to a
special class of graphs, called point graphs in [Sen and Luson 1997] and also
known as disk graphs.

We assume that a schedule is composed of a sequence of time slots. In each
time slot, a node is either idle, sending, or receiving a message on a
specified channel. The schedule must ensure that the sender and recipients of a
message are synchronized, and it must prevent message collisions. Such a
schedule can be effective only if the communication pattern between the sensors
is sufficiently regular and predictable. We assume that the communication
pattern is periodic and can be described as follows. During each period, a
finite set M of messages must be transmitted. Each message m has a unique
sender $s(m) \in V$ and a set of recipients $r(m) \subseteq V$. We assume that
the nodes of $r(m)$ are all neighbors of $s(m)$, that is,
$r(m) \subseteq E(s(m))$. In other words, we ignore multihop routing issues.

In this model, a communication schedule is defined by the number n of time
slots it contains and by the time slot $t(m)$ and the band $b(m)$ assigned to
each message $m \in M$. The time slot $t(m)$ is an integer between 1 and n, and
$b(m)$ is one of the available channels. In time slot $t(m)$, node $s(m)$ must
be sending m on band $b(m)$ and all nodes in $r(m)$ must be listening on band
$b(m)$ to receive the message. For these mappings to define a valid schedule,
the following two constraints must be satisfied:

1.  If $m_1 \neq m_2$ and $t(m_1) = t(m_2)$, then the two sets
$A(m_1) = r(m_1) \cup \{s(m_1)\}$ and $A(m_2) = r(m_2) \cup \{s(m_2)\}$ are
disjoint.

2.  If $m_1 \neq m_2$, $t(m_1) = t(m_2)$, and $b(m_1) = b(m_2)$, then
$r(m_1) \cap E(s(m_2)) = \emptyset$.


The first constraint guarantees that two messages sent in the same slot have no
recipient in common, that their senders are distinct, and that the sender of
one is not an intended recipient of the other. The second constraint prevents
collision between two messages. If $m_1$ and $m_2$ are sent on the same band in
the same slot, then the recipients of $m_1$ must not be within communication
range of $s(m_2)$, the sender of $m_2$.
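
The two constraints can be checked mechanically. The following is a minimal
Python sketch, not the implementation used in this work; the function names
and the dictionary-based representation of messages, slots, and bands are
assumptions made for illustration. Because constraint (2) applies to every
ordered pair of messages, the check is applied in both directions.

```python
from itertools import combinations

def neighbors(edges):
    """Build the neighbor map E(v) from a list of undirected edges."""
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    return nbrs

def is_valid_schedule(nbrs, messages, t, b):
    """Check constraints (1) and (2).

    messages maps a message id to (sender, set of recipients);
    t and b map each message id to its time slot and band.
    """
    for m1, m2 in combinations(messages, 2):
        if t[m1] != t[m2]:
            continue  # messages in different slots never conflict
        s1, r1 = messages[m1]
        s2, r2 = messages[m2]
        # Constraint (1): A(m1) and A(m2) must be disjoint.
        if (r1 | {s1}) & (r2 | {s2}):
            return False
        # Constraint (2): on a shared band, no recipient of one message
        # may be within range of the sender of the other.
        if b[m1] == b[m2]:
            if r1 & nbrs.get(s2, set()) or r2 & nbrs.get(s1, set()):
                return False
    return True

# Line of four sensors 1-2-3-4; node 1 sends to 2 while node 3 sends to 4.
line = neighbors([(1, 2), (2, 3), (3, 4)])
msgs = {"a": (1, {2}), "c": (3, {4})}
same_band = is_valid_schedule(line, msgs, {"a": 1, "c": 1}, {"a": 0, "c": 0})
two_bands = is_valid_schedule(line, msgs, {"a": 1, "c": 1}, {"a": 0, "c": 1})
```

In this example the two messages satisfy constraint (1), but node 2 is within
range of sender 3, so sharing a band violates constraint (2); assigning
distinct bands makes the schedule valid.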

If the objective is to maximize throughput, one must construct the two
mappings t and b so that they satisfy the above constraints while minimizing
the number of slots n. Alternatively, a bound T (i.e., the communication
period) may be specified a priori, and one attempts to construct a schedule
with $n \le T$. Both problems can be solved using general constraint
satisfaction problem (CSP) algorithms.

4.1.2  Special  Case:  Broadcast 

An interesting special case is when we have $r(m) = E(s(m))$ for all the
messages m. In this case, the recipients of any message from v are all the
neighbors of v; any message from v is broadcast to all nodes within
communication range of v. This broadcast mode of communication is very
efficient, as it maximizes the number of nodes that a message reaches while
minimizing the number of transmissions. In this broadcast mode, constraint (1)
above implies constraint (2): if $E(s(m_1)) \cup \{s(m_1)\}$ and
$E(s(m_2)) \cup \{s(m_2)\}$ are disjoint, then in particular
$r(m_1) \cap E(s(m_2)) = \emptyset$. The channel assigned to any message m is
therefore irrelevant. Thus, if all messages from every node are broadcast to
all its neighbors, there is no advantage in using more than one channel. Two
nodes that transmit in the same time slot must have no neighbor in common and
can then use the same frequency band.

Given a graph G = (V, E) representing a sensor network, let $G^2 = (V, E')$
be the square of G. There is an edge between two vertices v and v' of V in
$G^2$ if and only if the distance between v and v' in G is at most two. In
other words, there is an edge between v and v' in $G^2$ if v and v' are
adjacent or have a common neighbor in G. Since the band assignment b is
irrelevant, constructing a schedule amounts to assigning a time slot to each
message so that constraint (1) is satisfied: if $m_1 \neq m_2$ and
$t(m_1) = t(m_2)$ then $A(m_1) \cap A(m_2) = \emptyset$. By definition of
$G^2$, we have $A(m_1) \cap A(m_2) \neq \emptyset$ if and only if $s(m_1)$ and
$s(m_2)$ are adjacent in $G^2$. If the communication pattern M contains exactly
one message from each node, we can identify the message m and the node $s(m)$.
In this case, the mapping t is nothing other than a (proper) vertex coloring of
$G^2$. Constructing a schedule then amounts to finding a minimal coloring of
$G^2$. Even if the message pattern M contains distinct messages with the same
sender, one can transform the graph G and the pattern M so that a coloring of
$G^2$ gives a communication schedule.
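
The reduction from broadcast scheduling to coloring the square of G can be
illustrated with a short Python sketch. It assumes one broadcast message per
node and uses a simple greedy coloring, which yields a valid schedule but not
necessarily a minimal one; the function names and the neighbor-map
representation are illustrative choices rather than part of the original work.

```python
import itertools

def square(nbrs):
    """Neighbor map of G^2: v and w are adjacent in G^2 when they are
    adjacent in G or share a neighbor in G."""
    sq = {}
    for v, ns in nbrs.items():
        reach = set(ns)
        for u in ns:
            reach |= nbrs[u]
        reach.discard(v)  # no self-loops
        sq[v] = reach
    return sq

def greedy_slots(sq):
    """Give each node the smallest slot (1, 2, ...) not used by any of its
    neighbors in G^2, visiting nodes by decreasing degree."""
    slot = {}
    for v in sorted(sq, key=lambda x: -len(sq[x])):
        used = {slot[w] for w in sq[v] if w in slot}
        slot[v] = next(c for c in itertools.count(1) if c not in used)
    return slot

# Path 1-2-3-4-5: nodes 1 and 4 are three hops apart and may share a slot.
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
schedule = greedy_slots(square(path))
```

On this five-node path the greedy pass uses three slots, which is also the
minimum here because any three consecutive nodes form a clique in the square
of the path.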

Coloring the square of a graph, and several generalizations, has been
extensively studied in a context related to ours, namely the assignment of
noninterfering radio channels to base stations of a cellular network. If G has
maximal degree $\Delta$, then obviously a coloring of $G^2$ requires at least
$\Delta + 1$ colors. Conversely, it is easy to see that $\Delta^2 + 1$ colors
are sufficient, since each vertex has at most $\Delta^2$ neighbors in $G^2$.

Given a graph G, finding a minimal coloring for $G^2$ is NP-hard for general
graphs [McCormick 1983]. The problem becomes polynomial for restricted classes
of graphs (for example, when G is a tree or is obtained by assuming all the
nodes are located on a straight line) [Sen and Luson 1997]. On the other hand,
the problem is known to be NP-hard for point graphs [Sen and Luson 1997], the
class of graphs that model ANTs-like sensor networks. It is also NP-hard for
planar graphs [Ramanathan and Lloyd 1992].

4.1.3  Algorithms  and  Experiments 

We have implemented and experimented with two graph-coloring algorithms for
constructing communication schedules. These algorithms take as input a graph G,
compute its square, and attempt to construct a minimal coloring of $G^2$. Our
first prototype uses the Chaff SAT solver [Moskewicz et al 2001]. It encodes
the coloring problem as a boolean satisfiability problem that is then solved by
Chaff. A main limitation of this approach is that the boolean solver does not
take into account the many symmetries present in the problem. A better branch
and bound coloring algorithm was then implemented. It uses a standard heuristic
(DSATUR) and starts from an initial partial coloring obtained from a maximal
clique of $G^2$.
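
For readers unfamiliar with DSATUR, the following Python sketch shows the
greedy form of the heuristic only. It is not the branch and bound algorithm
described above: it performs no backtracking, does not start from a clique,
and is not guaranteed to find a minimal coloring. The function name and the
adjacency-map input are assumptions made for illustration.

```python
def dsatur(adj):
    """Greedy DSATUR: repeatedly color the uncolored vertex whose neighbors
    already use the most distinct colors, breaking ties by larger degree."""
    color = {}

    def saturation(v):
        # Number of distinct colors already used by the neighbors of v.
        return len({color[w] for w in adj[v] if w in color})

    while len(color) < len(adj):
        v = max((u for u in adj if u not in color),
                key=lambda u: (saturation(u), len(adj[u])))
        used = {color[w] for w in adj[v] if w in color}
        # A vertex has fewer than len(adj) neighbors, so a free color exists.
        color[v] = next(c for c in range(len(adj)) if c not in used)
    return color

# A 5-cycle needs three colors; a 6-cycle needs only two.
c5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
c6 = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
colors_c5 = dsatur(c5)
colors_c6 = dsatur(c6)
```

In a branch and bound setting, a heuristic of this kind typically serves to
choose the next vertex to branch on, while the best coloring found so far
bounds the search.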

Example results from these algorithms are given in Table 4.1. These results
were obtained for a fixed set of 100 sensors placed 10 units apart on a 10 x 10
square grid. Different graphs G were then obtained by varying the communication
range of the sensors. Each row of the table shows the maximal degree of G (that
is, the largest number of sensors within communication range of a single node),
the minimal number of colors found, and the search time for both algorithms.
The branch and bound algorithm always finds the minimal coloring on these
examples, but the SAT solver fails (after several hours of search) on the last
example, where the communication range is 31 units.
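
The experimental graphs are easy to regenerate. The Python sketch below builds
the 10 x 10 grid with 10-unit spacing and computes the maximal degree of G for
each communication range listed in Table 4.1. It assumes, as in the model of
Section 4.1.1, that two sensors are neighbors when their distance is strictly
less than the range; the function names are illustrative.

```python
import math

def grid_graph(side, spacing, comm_range):
    """Neighbor map for side x side sensors placed on a square grid."""
    pts = [(x * spacing, y * spacing)
           for x in range(side) for y in range(side)]
    nbrs = {p: set() for p in pts}
    for p in pts:
        for q in pts:
            # Neighbors are distinct sensors closer than the range.
            if p != q and math.dist(p, q) < comm_range:
                nbrs[p].add(q)
    return nbrs

def max_degree(nbrs):
    """Largest number of sensors within range of a single node."""
    return max(len(ns) for ns in nbrs.values())

ranges = (12, 15, 21, 23, 29, 31)
degrees = [max_degree(grid_graph(10, 10, r)) for r in ranges]
```

Under these assumptions the computed degrees are 4, 8, 12, 20, 24, and 28,
which agree with the maximal degree column of Table 4.1.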

Range   Max Degree   Coloring   SAT alg.   Branch/Bound
                     (colors)   (time)     (time)
12      4            5          0          0
15      8            9          0          0
21      12           13         0          0
23      20           23         11.18s     8.17s
29      24           25         0          0
31      28           33         -          81.42s

Table 4.1: Experiments on a 10 x 10 Sensor Grid


These examples and further experiments show that straightforward coloring
algorithms can find optimal communication schedules for sensor networks of up
to a few hundred nodes, provided the communication range (and thus the maximal
degree of the graph G) is relatively small. The branch and bound algorithm is
much more efficient than SAT encoding when the communication range increases.

Although these initial results showed the potential and feasibility of
constructing communication schedules for multichannel wireless sensor networks,
we did not pursue this direction further because of a mismatch between our
assumptions and the ANTs challenge-problem hardware. The sensors developed for
ANTs have a communication range that is very large compared with their sensing
range. As a result, in any practical configuration, all the sensors are within
communication range of each other. The corresponding graph G is then a complete
graph, and communication scheduling in such a case reduces to a trivial round
robin. A second limitation is that scheduling requires a priori knowledge of
the communication patterns in the network. Unfortunately, it is very difficult
to predict or bound how much communication will occur when sensors cooperate
and negotiate using complex algorithms.


4.2  A  Channel  Reservation  Protocol 

As an alternative to channel scheduling, we investigated an access control
protocol based on channel reservation. This protocol uses channel 0 as a
control channel, while the other channels are used for communicating
application data between sensors. The protocol allows a sensor to reserve one
channel for communication between itself and a set of other sensors. The intent
of this protocol is primarily for an ANTs tracker agent to select a channel on
which it will receive data from a set of sensing agents for tracking purposes.

Using this protocol, each node in the network maintains a table T that keeps
track of the current reception channel of all the other nodes. A network node
can be in one of two states. In the default idle state, a node is listening on
the control channel, namely channel 0. In the active state, a node has been
allocated a specific reception channel (other than 0) and is listening on this
channel. Initially, all nodes are idle.

When a node v needs to become active, it first selects a free channel (if any),
then requests to be allocated that channel by broadcasting a control message on
channel 0. If no message collision occurs, all currently idle nodes receive the
request and one of them sends an acknowledgment on the control channel. The
reception of this acknowledgment indicates to v that its request is granted: v
moves to the active state and switches to its new listening channel. Since both
the request and the acknowledgment are broadcast on channel 0, all the idle
nodes receive them and can determine that v has changed channel and update
their table T accordingly. The protocol attempts to keep the tables T of all
the idle nodes consistent and up to date. All idle nodes can then determine
locally the channel to use when they need to send data to v. The common table T
is also used to determine which node should respond to requests on the control
channel. A specific node $x_T$ is selected from T and is in charge of
responding to all requests sent on channel 0. A second node $y_T$ is in charge
of responding to requests that originate from $x_T$ itself. Both $x_T$ and
$y_T$ are selected using an arbitrary rule among the idle nodes of T (e.g., the
two idle nodes of smallest indices). An active node cannot keep its copy of T
up to date since it does not receive the control messages and acknowledgments.
If an active node needs to send data to v, it first requests an up-to-date copy
of T on channel 0; this fresh copy is broadcast by $x_T$.
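
The bookkeeping performed by each idle node can be sketched as follows. This
Python fragment is only an illustration under stated assumptions: the table T
is represented as a dictionary from node index to reception channel, the eight
channels are assumed to be numbered 0 to 7, a channel is treated as free when
no entry of T uses it, and the responders are chosen by the smallest-index rule
given above as an example. None of the function names come from the original
prototype.

```python
CONTROL = 0          # channel 0 is the control channel
NUM_CHANNELS = 8     # assumed numbering: channels 0..7

def free_channel(table):
    """Lowest data channel not reserved by any node in T, or None."""
    in_use = set(table.values())
    for ch in range(1, NUM_CHANNELS):
        if ch not in in_use:
            return ch
    return None

def responders(table):
    """Return (x_T, y_T): the two idle nodes of smallest index."""
    idle = sorted(n for n, ch in table.items() if ch == CONTROL)
    x = idle[0] if idle else None
    y = idle[1] if len(idle) > 1 else None
    return x, y

def responder_for(table, requester):
    """x_T answers every request except its own, which y_T answers."""
    x, y = responders(table)
    return y if requester == x else x

def grant(table, requester, channel):
    """Table update applied by each idle node that hears the request and
    its acknowledgment on channel 0."""
    updated = dict(table)
    updated[requester] = channel
    return updated

# Four idle nodes; node 1 (the current x_T) reserves a channel.
T = {1: CONTROL, 2: CONTROL, 3: CONTROL, 4: CONTROL}
ack_from = responder_for(T, 1)
T_after = grant(T, 1, free_channel(T))
```

Because every idle node applies the same deterministic rule to the same table,
all of them agree on which node must acknowledge, provided their copies of T
are consistent.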

Message collisions may occur on channel 0 if two nodes send requests
simultaneously. However, the protocol attempts to minimize the probability of
collisions by limiting control traffic. Control messages (requests and
responses from $x_T$ or $y_T$) are small to reduce the chance of collisions. In
addition, all idle nodes listen to the control traffic and can determine
whether or not channel 0 is busy. If an idle node observes that a request has
been sent by another node and has not been acknowledged, it will delay its own
requests until the expected acknowledgment is transmitted. These mechanisms
reduce but do not eliminate message collisions on the control channel. However,
a node that has sent a request can determine via a timeout mechanism whether
the request has succeeded: if no acknowledgment is received before the timeout,
the request is assumed not to have been received. A random backoff and retry
mechanism allows a node to resend a request in such a case.
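
A timeout with random backoff and retry can be sketched as below. The text
does not specify the backoff distribution, so the doubling window used here is
an assumption, as are the function name, its parameters, and the convention
that the hypothetical callback send_request returns True when an acknowledgment
arrives before the timeout.

```python
import random
import time

def request_with_backoff(send_request, max_attempts=5, base_delay=0.01,
                         rng=random, sleep=time.sleep):
    """Send a request until it is acknowledged.

    Returns the number of attempts used, or None if every attempt timed
    out. send_request() is expected to return True when an acknowledgment
    is received before the timeout and False otherwise.
    """
    for attempt in range(max_attempts):
        if send_request():
            return attempt + 1
        # Wait a random time before retrying; the window doubles after
        # each failure (an assumed policy, not taken from the protocol).
        sleep(rng.uniform(0, base_delay * 2 ** attempt))
    return None

# Simulated channel: the first two requests collide, the third succeeds.
outcomes = iter([False, False, True])
delays = []
attempts = request_with_backoff(lambda: next(outcomes), sleep=delays.append)
```

Randomizing the delay makes it unlikely that two nodes whose requests collided
will retry at the same instant and collide again.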

We have experimented with a prototype implementation of this protocol in Java
for the challenge problem simulator (radsim). Unfortunately, performance was
disappointing. The main difficulty was that Java and the radsim simulator do
not give any means to quickly react and respond to messages received on the
control channel. The Java implementation is multithreaded. An RF message is
first obtained by a Java listening thread that polls the hardware driver; it is
then copied and buffered for another Java thread to process. Both threads
compete with each other, with other Java threads inside the JVM, and with
non-Java processes running on the same machine. The delay between a message
being received by the hardware and its processing by the software depends on
how and when the listening thread is active. The JVM and operating system do
not provide ways to accurately schedule the listening thread. As a result, we
never managed to reliably make the responder node receive all requests on
channel 0 and send acknowledgments in a timely manner. Delayed acknowledgments
are disastrous, as they cause many requests to fail and can lead to distinct
nodes having inconsistent views of the table T. Furthermore, it is difficult to
ensure that all idle nodes have processed the same messages, as scheduling
delays vary across machines, which again causes inconsistencies.


4.3  Paper  on  Dynamic  Scan  Scheduling 

A paper describing the results of our previous research in the ANTs program
was written and presented at the IEEE Real-Time Systems Symposium in Austin,
Texas, in December 2002.


Chapter 5

Summary  and  conclusions 


The table in Figure 5.1 summarizes our contributions to the solution of the
problems posed in Chapter 1. We repeat here the desiderata set forth in
Chapter 1.

Distributed: Any resource allocation algorithm should be distributed in the
sense that it should not depend on some centralized repository of global
information where allocation decisions must be made. Such an assumption would
be overly restrictive given the constraints imposed within real-world settings
in which inter-agent communications may be limited.

Incremental and realtime: The time-stressed nature of real-world problem
domains precludes the possibility of computing optimal resource allocations
before execution. Instead, agents should negotiate partial, good-enough
allocations which can later be refined if time permits.

Flexible  task  allocation:  Task  allocation  mechanisms  should  be  flexible  in  the 
sense  that  potential  allocations  can  be  explored  either  sequentially,  in  terms 
of  possible  task-resource  pairs,  or  combinatorially  in  the  form  of  sets  of 
multiple  tasks  and  resources.  In  the  latter  case,  mechanisms  should  be  able 
to  deal  with  tasks  that  can  interact:  that  is,  in  which  the  cost  of  doing  several 
tasks  is  not  simply  the  sum  of  the  individual  costs  of  each  task. 

Adaptive resource allocation: Dynamic problem settings in which new tasks or
resources can appear (or disappear) during the allocation process require that
allocation processes need not be restarted from scratch each time the global
situation changes.

Adaptive communications: Since communications bandwidth is assumed to vary,
allocation algorithms should be able to adapt to limits imposed by the
communications medium.

Fault  tolerant:  Any  solution  should  be  fault  tolerant  in  the  sense  that  it  should 
be  adaptable  to  resource  loss  during  execution,  as  opposed  to  requiring  that 
the  allocation  be  re-started  from  scratch. 

Scalable:  Any  solution  should  be  scalable  to  very  large  agent  and  task  settings. 

The leftmost column of Figure 5.1 refers to the above requirements. The
remaining columns summarize the various algorithms and approaches we have
explored. As shown, the best results were obtained with either the dynamic
mediation algorithm combined with a team commitment architecture and monitoring
process, or the DDM system with hierarchical team structure. In the case of
mediation, we did not run any experiments on large-scale systems. DDM is less
flexible than mediation since the "negotiation" is kept to an absolute minimum.

Figure 5.1: Meeting the desiderata: note that this table highlights the major
elements (left-hand column) addressed by each approach (top row) examined in
this report. The column "combinatorial allocation" refers to the algorithm
described in Chapter 2.